Author Archives: gene_x

Functions of the LT and sT Proteins of MCV (Merkel Cell Polyomavirus)

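🧬 I. MCV LT Protein (Large T Antigen)

The LT protein is one of the core proteins of the MCV life cycle. It initiates viral DNA replication and also interferes with the host cell cycle.

✅ Main functions of LT:

1. Initiating viral DNA replication

Recognizes and binds the origin of replication in the viral genome
Unwinds the DNA at the origin (DNA melting)
Recruits host DNA replication factors (such as DNA polymerase)
Enables efficient replication of viral DNA in the host cell

2. Regulating viral transcription

Controls the expression of viral genes at different stages and thereby regulates the viral life cycle

3. Interfering with the host cell cycle

Binds host cell-cycle regulators (such as pRb)
Drives the host cell into S phase, which provides a favorable environment for viral replication

4. Potential role in oncogenesis

A truncated LT protein is commonly found in Merkel cell carcinoma (MCC)
Truncated LT has lost its replication function but can still bind tumor suppressors (such as pRb)
This contributes to carcinogenesis

📌 Summary

Full-length LT: mainly drives viral replication and regulates the viral life cycle
Truncated LT: no longer replicates the virus, but may promote cancer by inactivating tumor suppressors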

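🧬 II. MCV sT Protein (Small T Antigen)

The sT protein is a multifunctional regulator that plays a key role in the oncogenic potential of the virus, mainly by affecting cell signaling pathways and the protein degradation machinery.

✅ Main functions of sT:

1. Inhibiting protein phosphatase 2A (PP2A)

PP2A acts as a tumor suppressor
sT inhibits PP2A → activates growth signaling pathways such as MAPK and AKT/mTOR
Promotes sustained cell proliferation, which favors viral replication and carcinogenesis

2. Promoting cell transformation and tumor formation

In experimental models, sT can induce cell transformation (for example, colony formation in soft agar)
It is one of the key oncogenic factors in Merkel cell carcinoma

3. Interfering with the protein degradation system

Interacts with factors such as UBE2C (a ubiquitin-conjugating enzyme)
Disturbs protein degradation and cell-cycle regulation

4. Stabilizing LT and increasing its levels

Inhibits LT degradation and extends its lifetime in the cell
Enhances viral replication and oncogenic potential

🧩 Summary

PP2A inhibition: activates growth signaling and promotes proliferation
LT stabilization: enhances viral functions
Interference with protein degradation: disrupts cellular homeostasis and promotes carcinogenesis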

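🧬 III. PP2A and UBE2C in Brief (Human Genes)

1. PP2A (Protein Phosphatase 2A)

A human enzyme complex; its catalytic subunit is encoded by PPP2CA and PPP2CB
A key tumor-suppressive protein phosphatase complex
Main functions: regulation of the cell cycle, DNA repair, and apoptosis
Inhibited by sT, which favors viral replication and cell transformation

2. UBE2C (Ubiquitin-Conjugating Enzyme E2C)

A human gene that encodes a ubiquitin-conjugating enzyme
Takes part in the ubiquitin-mediated degradation of cell-cycle proteins
Drives the transition from metaphase to anaphase in mitosis
Frequently overexpressed in cancer; MCV sT may increase its activity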

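🔁 IV. The Cell Cycle and S Phase in Brief

Main phases of the cell cycle:

G1 phase: RNA and protein synthesis, cell growth
S phase: DNA replication; each chromosome goes from a single chromatid to two sister chromatids
G2 phase: DNA replication errors are checked and the cell prepares to divide
M phase: mitosis, producing two daughter cells
G0 phase: quiescence; the cell is not in the proliferative cycle

🦠 How does the virus exploit S phase?

Small DNA viruses such as MCV do not have their own DNA replication machinery
They depend on the replication factors (such as DNA polymerase) that the host cell provides in S phase
MCV LT inhibits tumor suppressors such as pRb and thereby releases the cell-cycle block
The cell is forced into S phase, which creates favorable conditions for viral DNA replication

📌 Analogy:

Cell = factory
S phase = the factory running at full capacity to copy DNA
Virus = an outside order that jumps the queue
LT protein = a manager who forces the factory to work overtime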

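V. Why Large T Is Called an "Antigen" Rather Than a "Protein"

"Large T" usually refers to the Large T antigen, a protein first described in polyomaviruses such as SV40. Although it is a protein, the literature calls it an "antigen" for the following reasons:

1. The name reflects how it was first detected

"Antigen" here does not primarily refer to its ability to trigger an immune response (although it can). Historically, the protein was first identified in virus-infected host cells because antibodies recognized it.
Because it was discovered by immunological methods, it was called an "antigen".

2. The name follows virological tradition

In SV40 research, several important virus-encoded proteins were identified, including:
- Large T antigen
- Small t antigen
The "T" stands for tumor, because these proteins are associated with tumor formation.
"Antigen" was the usual term in early immunological assays and has been kept ever since.

3. It is much more than "just a protein"

Large T antigen is a multifunctional protein required for viral replication and for cell transformation (making cells cancerous).
It interacts with several key host proteins (such as p53 and Rb), regulates the cell cycle, and interferes with tumor suppressors.
It is therefore widely used as a molecular marker and experimental tool in cell biology and cancer research.

4. Summary

Large T antigen is a protein, but it kept the name "antigen" because it was first discovered by immunological methods and because it became a functional landmark in virology and cell biology. The name combines historical convention with a reference to function.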

🧬 Cadmium Resistance Gene Analysis in Staphylococcus epidermidis HD46


🎯 Objective

To determine whether the genome of S. epidermidis HD46 contains known genes associated with natural cadmium resistance.

✅ Step 1: Prepare the Genome File

Confirm that the genome file is in a compatible format such as .fasta, .gbk, or .gff + .fna.
If the genome is not annotated, proceed to annotation.

✅ Step 2: Annotate the Genome (if needed)

Option 1: Prokka (command-line tool)

Use Prokka to generate gene predictions and protein sequences.

Example command:

prokka HD46.fasta --outdir prokka_out --prefix HD46

Option 2: RAST (web-based tool)

If you prefer a graphical interface:

Go to RAST Server
Create a free account and log in
Upload your genome in FASTA format
Select the default annotation pipeline
Wait for the annotation to complete (this may take some time)
Download the annotated protein sequences (.faa) and genome features (.gff, .tbl, etc.)

✅ Step 3: Identify Known Cadmium Resistance Genes

Look for genes known to confer cadmium resistance, including:

cadA: a cadmium-translocating P-type ATPase
cadC: regulatory protein for cad operon
czcA / czcD: part of heavy metal efflux systems
arsRBC operon: primarily arsenic-related but may overlap in function
copA: copper ATPase, sometimes associated with cadmium resistance

✅ Step 4: Search for Homologous Genes Using BLAST

Create a local BLAST database

Convert the annotated protein sequences into a searchable format:

makeblastdb -in HD46.faa -dbtype prot

Prepare a FASTA File with Reference Cadmium Resistance Proteins

Download cadA, cadC, and other related protein sequences from NCBI or UniProt.
Save them into a single FASTA file named cadmium_genes.faa.

Run BLASTP:

blastp -query cadmium_genes.faa -db HD46.faa -evalue 1e-5 -outfmt 6 -out cadmium_blast_results.txt

Review the output for significant matches based on:
- Percent identity
- E-value
- Alignment length
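
The three review criteria above can be applied programmatically. The following is a minimal sketch, assuming the default 12-column layout of `-outfmt 6`; the threshold values and the two example rows are illustrative assumptions, not results from HD46.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6) with the default fields.
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, min_identity=40.0, min_length=100, max_evalue=1e-5):
    """Yield hits that pass all three thresholds (illustrative defaults)."""
    reader = csv.DictReader(handle, fieldnames=BLAST6_COLUMNS, delimiter="\t")
    for row in reader:
        if (float(row["pident"]) >= min_identity
                and int(row["length"]) >= min_length
                and float(row["evalue"]) <= max_evalue):
            yield row


# Two made-up rows, only to show the expected file layout.
example = (
    "cadA_ref\tHD46_00123\t78.5\t720\t150\t3\t1\t720\t5\t724\t0.0\t1150\n"
    "cadC_ref\tHD46_00999\t28.0\t60\t40\t2\t10\t69\t3\t62\t2e-4\t35.1\n"
)
hits = list(filter_hits(io.StringIO(example)))
print([h["sseqid"] for h in hits])
```

For real data, pass `open("cadmium_blast_results.txt")` instead of the `io.StringIO` example and adjust the thresholds to the protein family in question.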

✅ Step 5: Use the BacMet Database

BacMet is a curated database of antibacterial biocide and metal resistance genes; it provides a set of experimentally confirmed genes and a larger set of predicted genes.

Go to the BacMet download page
Download the BacMet2 protein FASTA file.

Create a local BLAST database:

makeblastdb -in BacMet2_PROTEIN.fasta -dbtype prot

Run BLASTP:

blastp -query HD46.faa -db BacMet2_PROTEIN.fasta -evalue 1e-5 -outfmt 6 -out bacmet_hits.txt

Analyze the output to identify any matches to known cadmium resistance genes.

✅ Step 6: Optional – Analyze with CARD

CARD focuses on antibiotic resistance rather than metal resistance, so treat this step as a complementary check, for example for efflux systems that may also contribute to metal tolerance.

Visit the CARD RGI tool
Upload your .faa file (e.g., HD46.faa)
Review the hits for any relevant metal resistance gene annotations

✅ Step 7: Visual Inspection of Gene Clusters

Use a genome browser like Artemis, Geneious, or SnapGene to open the annotated genome.

Look for:

Clusters of metal resistance genes (e.g., cadA, cadC, czcA/D)
Regulatory genes upstream (e.g., cadC before cadA)
Adjacent stress response or membrane transport genes

These patterns may suggest operon-like organization and co-regulation.
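
As a complement to visual inspection, gene adjacency can be checked directly in the annotation. This is a minimal sketch, assuming a Prokka-style GFF3 in which CDS rows carry a `gene=` attribute; the example rows, locus tags, and the 500 bp gap threshold are hypothetical.

```python
def parse_gene_features(gff_text):
    """Return sorted (contig, start, end, strand, gene) tuples for named CDS rows."""
    genes = []
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # sequence section starts here; no more features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        if "gene" in attrs:
            genes.append((cols[0], int(cols[3]), int(cols[4]), cols[6], attrs["gene"]))
    return sorted(genes)


def neighbours(genes, name_a, name_b, max_gap=500):
    """True if both genes lie on the same contig at most max_gap bp apart."""
    for contig_a, start_a, end_a, _, gene_a in genes:
        if gene_a != name_a:
            continue
        for contig_b, start_b, end_b, _, gene_b in genes:
            if gene_b == name_b and contig_a == contig_b:
                gap = max(start_a, start_b) - min(end_a, end_b)
                if gap <= max_gap:
                    return True
    return False


# Hypothetical annotation rows, only to illustrate the format.
EXAMPLE_GFF = "\n".join([
    "##gff-version 3",
    "contig_1\tProkka\tCDS\t100\t460\t.\t+\t0\tID=HD46_00001;gene=cadC",
    "contig_1\tProkka\tCDS\t470\t2650\t.\t+\t0\tID=HD46_00002;gene=cadA",
    "contig_2\tProkka\tCDS\t5000\t7400\t.\t-\t0\tID=HD46_01500;gene=copA",
])
genes = parse_gene_features(EXAMPLE_GFF)
print(neighbours(genes, "cadC", "cadA"))
print(neighbours(genes, "cadA", "copA"))
```

Adjacency alone does not prove co-transcription; it only flags candidates worth inspecting in the genome browser.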

✅ Step 8: Domain Confirmation

To verify if candidate genes are functionally related to cadmium resistance:

Use InterProScan
Or NCBI’s CD-Search

Look for domains such as:

Heavy-metal-associated (HMA) domains
P-type ATPase domains
ArsR-type helix-turn-helix regulators

✅ Step 9: Interpret Results

A strong match to cadA or cadC with conserved domains indicates cadmium resistance potential.
Presence of gene clusters supports likely functionality.
Distant or partial homologs may require experimental validation.
Absence of canonical genes doesn’t exclude alternative mechanisms of resistance.

🔧 Tools Mentioned

Prokka – Command-line annotation
RAST – Web-based annotation
BLAST+ – Protein similarity search
BacMet – Metal resistance gene reference
CARD – Antibiotic resistance database (optional cross-check)
InterProScan / CD-Search – Domain analysis
Artemis / Geneious – Genome browsing and visualization

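Cadmium resistance genes

Cadmium resistance genes enable organisms (such as bacteria, plants, or other microorganisms) to survive and grow in environments that contain the toxic heavy metal cadmium. They typically encode proteins that help the cell export, detoxify, or sequester cadmium ions and thereby reduce cadmium toxicity.

Main mechanisms of cadmium resistance:

Efflux pumps

These proteins actively pump cadmium ions out of the cell and lower the intracellular cadmium concentration. For example, the cadA gene encodes an ATP-driven cadmium efflux protein.

Metal-binding proteins

Proteins such as metallothionein bind cadmium ions and neutralize their toxicity.

Regulatory proteins

These proteins control the expression of cadmium resistance genes so that the resistance mechanism is switched on when cadmium is present.

Typical examples:

cadA gene

First found on a plasmid of Staphylococcus aureus; it encodes a protein dedicated to pumping cadmium ions out of the cell.

czc system

Found in some bacteria (such as the metal-tolerant Cupriavidus metallidurans); an efflux system that confers resistance to cadmium, zinc, and cobalt at the same time.

Metallothioneins

Widespread in plants and microorganisms; they bind cadmium ions and protect the cell from damage.

Meaning of "Cadmium"

Cadmium (镉 in Chinese) is a chemical element with the symbol Cd and atomic number 48.

Properties: a silvery-white metal that is ductile and soft.
Toxicity: cadmium is a toxic heavy metal that harms humans and the environment; long-term exposure causes poisoning.
Uses: mainly nickel-cadmium batteries, electroplating, and pigments.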

Stihl Elektro-Heckenschneider


Product information (14 reviews)

Pros and cons
    High power
    Quiet electric hedge trimmer
    Cutting performance from a blade sharpened on both sides
    Extension cable required

Scope of delivery
    1 x Stihl HLE 71 K electric hedge trimmer
    1 x instruction manual

Description
The Stihl HLE 71 K electric hedge trimmer is a versatile tool with a short shaft. Its 600-watt motor provides the power needed to cut tall and wide hedges. The cutter bar adjusts in steps over 125°, so you can cut flexibly and comfortably from the ground. The trimmer is also well suited to maintaining ground cover and shrubs.

Powerful, efficient & quiet
With its blade ground on both sides, the Stihl HLE 71 K delivers powerful cutting performance. You can cut quickly and efficiently without much effort. The cutter bar, which pivots over 125°, also allows precise cuts so that you get the result you want.

Besides being powerful and versatile, the Stihl HLE 71 K is a quiet electric hedge trimmer with a short shaft, so you can trim without disturbing others.

Safety first
Safety is a high priority at Stihl, and the HLE 71 K is equipped with overload protection for additional safety during use.

Telescopic shaft
The Stihl HLE 71 K electric hedge trimmer with telescopic shaft is a good choice for anyone looking for a powerful, versatile, and user-friendly hedge-care tool. Whether you are a keen gardener or a professional, it offers the power and reliability needed to keep the garden looking its best.

Specifications

        Brand   Stihl
        Item number   48130112908
        Barcode 795711401771
        Cutting strokes per minute    4000
        Blade type   Double-sided
        Cutting capacity (mm)    35
        Anti-vibration system    No
        Rotating handle No
        Blade length (cm)    50
        Pole-mounted hedge trimmer   Yes
        Noise level (dB)   85
        Motor type    Carbon brushes
        Power (W)  600
        Overall length 211 cm


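TODO: Buy the corded electric model. There are already two sockets at the property boundary, so the cable will not be much trouble, and the corded model has more power and a higher cutting performance. Stihl HLE 71 600W 2.54Mtr Long Reach Hedgetrimmer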

# ---- Stihl Elektro-Heckenschneider HLE 71 L (€ 449,00) ----

<a href="https://kaisers.jetzt/products/stihl-elektro-heckenschneider-hle-71-l?kendall_source=google&#038;kendall_campaign=21209252338&#038;kendall_adid=&#038;gad_source=1&#038;gad_campaignid=21205673891&#038;gbraid=0AAAAADxVzG7DAsJxdsViEhvAGo8KEuLIe&#038;gclid=CjwKCAjwprjDBhBTEiwA1m1d0pdG4sXyHyuz_6GOSL8CnEhuzpZyt3jXcq5HfqK8UmTem7Go_PMJ6xoCgG8QAvD_BwE">Stihl Elektro-Heckenschneider HLE 71 L</a>

Pros and cons
    Very quiet
    For demanding home users and professionals
    High cutting performance
    Extension cable required

Specifications
Brand   Stihl
Item number   48130112909
Barcode 795711401788
Motor type    Carbon brushes
Power (W)  600
Cutting capacity (mm) 35
Anti-vibration system    No
Rotating handle No
Blade length (cm)    50
Pole-mounted hedge trimmer   Yes
Noise level (dB)   84
Overall length 254 cm

# ---- STIHL HLA 66 with AP 200 battery and AL 101 charger (519.00 €) ----

Cordless hedge trimmer, AP system, item no.: 90006171

https://www.passiontec.de/stihl-hla-66-mit-akku-ap-200-und-ladegeraet-al-101.html#

#------ STIHL HLE 71 K ------
SPECIAL PRICE
Unit supplied without outer packaging!

Powerful 600-watt electric long-reach hedge trimmer in the short version, with a cutter bar adjustable in steps over 125°. Flexible cutting of tall and wide hedges, comfortably from the ground. Very good for cutting directly at the hedge or from a distance.

Shaft lengths to suit the job
The long-shaft hedge trimmers are used mainly for tall and wide hedges. The shorter K versions with an extended grip hose are handy tools for working on low hedges.

Loop handle
The practical loop handle makes the tool easy to guide precisely. Its position can be adjusted to the user's height and to different cutting tasks. (Illustration similar)

Cable strain relief
The cable strain relief prevents the plug connection from being pulled apart accidentally when the cable is dragged along (illustration similar).

Quick-adjustment system 125°
The cutter bar can be set in steps by up to 125° in two directions and can be folded parallel to the shaft and locked for transport (transport position).

Soft grip
The soft grip absorbs part of the vibration and is very comfortable to hold (illustration similar).

🧰 Stihl HLE 71 K vs HLE 71 L
General model (HLE 71)
– 600 W electric motor, 50 cm double-sided blade, cutter bar adjustable up to 125°
– Stroke rate: 4,000 cuts/min; well suited to cutting tall hedges from the ground

Differences in length and weight
– HLE 71 K (short version): overall length approx. 211 cm, weight approx. 5.6 kg
– HLE 71 L (long version): overall length approx. 254 cm, weight approx. 5.9 kg

👷‍♂️ In practice:
– The K version suits lower or hard-to-reach hedges and is easier to handle
– The L version is better for tall or wide hedges: the extra length makes it easier to work without a ladder, at a similar weight

📝 Summary
Model  Length   Weight Best for
HLE 71 K    211 cm  5.6 kg  Low hedges, confined spaces
HLE 71 L    254 cm  5.9 kg  Tall hedges, comfortable work from a distance

Both offer the same performance (600 W, 4,000 cuts/min). The choice depends on the job: compact and agile (K) versus extra reach (L).

🧭 Use-case Comparison
Feature HL 91 KC‑E  HL 94 C‑E
Reach   ~1.7 m total length ~2.42 m total length
Weight  5.4–5.8 kg  6.1–6.2 kg
Blade Adjustment    130° angle  145° angle
Best For    Detailed, close-up hedge work   Tall or wide hedges without ladder
ErgoStart & 2‑MIX   Both models include these   Both models include these
Variable Speed (ECOSPEED)   —   ✔ (on-handle control)

⚙️ Which One Should You Choose?
Choose the HL 91 KC‑E if your work involves short to mid‑height hedges, topiary, or detailed trimming. It’s more maneuverable and lighter.

Opt for the HL 94 C‑E when managing tall or expansive hedges from ground level—the added length and reach save ladder time.

# ----

https://www.ebay.de/itm/317038395095?chn=ps&_ul=DE&norover=1&mkevt=1&mkrid=707-134425-41852-0&mkcid=2&mkscid=101&itemid=317038395095&targetid=2352311494706&device=c&mktype=pla&googleloc=9043454&poi=&campaignid=22418870055&mkgroupid=181487483647&rlsatarget=aud-1683823397950:pla-2352311494706&abcId=10262383&merchantid=113801033&gad_source=1&gad_campaignid=22418870055&gbraid=0AAAAAD_G4xZpXl-cR9-Nb5g9fY9yyPiuf&gclid=CjwKCAjwprjDBhBTEiwA1m1d0g0xhkpE_GIdrHWXOTC1U502fPZBfFdpDG9Ob9aOHsiCyQhsGGRr6hoCKjoQAvD_BwE

# ---- Stihl AK Akku-Hochentaster HLA56 ----

https://www.bauhaus.info/akku-hochentaster/stihl-ak-akku-hochentaster-hla56/p/32541104?utm_source=google&utm_medium=ssa&utm_id=8995092923_160945286067&cid=SSAGoo8995092923_160945286067&gad_source=1&gad_campaignid=8995092923&gbraid=0AAAAADNytnJro5vIkirxpGYmrpTOqwUSU&gclid=CjwKCAjwprjDBhBTEiwA1m1d0rpuBTlb1jY94u7vfLWM2f8ic8lI2DmvKWpAXKb78LqrbMXWwvpfQhoCTTsQAvD_BwE

Ideal for tall hedges – the long shaft allows comfortable cutting overhead and close to the ground
Flexible cutting head – blade angle adjustable without tools from -45° to +90°
Precise teardrop-shaped blades – hold branches securely for clean, controlled cuts
Splittable shaft – for easy transport and space-saving storage
01.05-31.10.2025: buy a selected Stihl AK-system product, register online, and claim cashback*

Blades ground on one side, with 30 mm tooth spacing

# ----- Husqvarna 520IHE3

https://www.galaxus.de/de/s4/product/husqvarna-520ihe3-akkuheckenschere-heckenschere-21800954?utm_campaign=preisvergleich&utm_source=idealo&utm_medium=cpc&utm_content=2705624&supplier=2705624

# ------ 1 x Stihl HL 91 KC-E ----

Specifications
Cutting strokes per minute   3615
Blade type   Double-sided
Cutting capacity (mm)    34
Anti-vibration system    No
Rotating handle No
Blade length (cm) 60
Pole-mounted hedge trimmer  Yes
Sound pressure level (dB(A))    92
Engine type    2-stroke
Displacement (cc)    24.1
Engine power (kW)  0.9

# ----- Stihl HL94C-E

Specifications
Cutting strokes per minute    3615
Blade type  Double-sided
Cutting capacity (mm)    34
Anti-vibration system    No
Rotating handle   No
Blade length (cm)    60
Pole-mounted hedge trimmer   Yes
Noise level (dB)   91
Engine type    2-stroke
Displacement (cm³)    24.1
Engine power (kW)   0.9

Description
Suitable for professional work on tall hedges and for use close to the ground
2-MIX engine with ECOSPEED speed control for long working intervals; weight-reduced gearbox
Cutter bar pivoting over 145°, and a long shaft
STIHL ErgoStart makes starting the machine particularly comfortable
The manual fuel pump delivers fuel to the carburetor at the press of a thumb
Supplied with a carrying strap as standard, which makes long jobs easier
Overall length: 242 cm
Weight: 6.2 kg
Displacement: 24.1 cm³
Brand: STIHL
Engine power: 0.9 kW
Engine power: 1.2 hp
Cutting length: 60 cm
Tooth spacing: 34 mm

#----

With an overall length of 2.1 metres, you can cut up to a height of 4 metres.

<a href="https://kaisers.jetzt/products/stihl-akku-heckenschneider-hla-56-inkl-akku-ak-20-und-ladegeraet">Stihl Akku-Heckenschneider HLA 56 inkl. Akku (AK 20) und Ladegerät</a>

Scope of delivery
    1 x Stihl HLA 56 tool only
    1 x Stihl AK 20 battery
    1 x Stihl AL 101 charger
    1 x wall bracket
    1 x instruction manual

Stihl Akku-Heckenschneider HLA 56 inkl. Akku (AK 20) und Ladegerät

Brand   Stihl
Item number   HA012000050
Barcode 795711985714
Cutting strokes per minute    2800
Blade type  Single-sided
Cutting capacity (mm)    30
Anti-vibration system    No
Rotating handle   No
Blade length (cm)    45
Pole-mounted hedge trimmer   Yes
Noise level (dB)   77
Motor type    Induction
Battery run time (minutes)   50
Charging time to 100 % per battery (minutes)    95
Area covered per battery charge (m², recommended battery)   380

https://www.agrieuro.de/2takt-benzin-heckenschere-geotech-gt-2-58-58-cm3-p-29087.html?utm_source=google&utm_medium=cpc&utm_campaign=PMaxPROFIT:IRRORAZIONE_PRIMAVERA-ESTATE(EX_Pompe_irroratrici)&ads_campaign=PMaxPROFIT:IRRORAZIONE_PRIMAVERA-ESTATE(EX_Pompe_irroratrici)&gad_source=1&gad_campaignid=19472668513&gbraid=0AAAAACzKbr_9KL1zu9qZGvEVvjh8ptD95&gclid=CjwKCAjwg7PDBhBxEiwAf1CVu1SsLhvH1OCbC56NTkvv6Coc_KNp03TSbMuohXr7fRNEUC3yw9QmfhoC8ukQAvD_BwE

#---------
VEVOR 26CC 2-Takt-Benzin-Heckenschere Heckentrimmer Langstielheckenschere 0,54L
https://www.ebay.de/itm/266910026586?chn=ps&_ul=DE&_trkparms=ispr%3D1&amdata=enc%3A1r8HHYJ2uS96bLrD-i1_BLg45&norover=1&mkevt=1&mkrid=707-173151-927826-9&mkcid=2&mkscid=101&itemid=266910026586&targetid=2381626844604&device=c&mktype=pla&googleloc=9191127&poi=&campaignid=22425403106&mkgroupid=181487483687&rlsatarget=aud-1683823398150:pla-2381626844604&abcId=10262377&merchantid=112867921&gad_source=1&gad_campaignid=22425403106&gbraid=0AAAAAD_G4xYAwKIuRqGYj6n0bz5iGErzq&gclid=CjwKCAjwg7PDBhBxEiwAf1CVu_EhXxgTbWurcyOmzj8EwFuxwtSBFutsxS5j6FXELumD-OU2Z1kN2RoCqLUQAvD_BwE

Setup the environment for lumicks-pylake and C_Trap-Multimer-photontrack.ipynb


https://lumicks-pylake.readthedocs.io/en/v0.8.2/install.html#updating

https://github.com/JamesLiWan/MultimerizationCode/blob/main/C_Trap-Multimer-photontrack.ipynb

Determine which pylake version to use for the legacy code.

 According to the changelog, the channel parameter was added in pylake v0.9.0:

 v0.9.0 (Jul 29 2021) introduces unit support and other updates, which is consistent with the channel argument appearing in track_greedy() in that release.

 Versions before v0.9.0, such as v0.8.2 (Apr 30 2021), use the old signature:

 track_greedy(kymograph, line_width, pixel_threshold, …)
 with no channel argument.

 Because the notebook calls track_greedy(data, …) without specifying channel, install pylake v0.8.2.
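
Whether an installed function accepts a given keyword can also be checked at runtime instead of relying on the changelog. This is a minimal sketch using only the standard library; the two `track_greedy_*` functions are hypothetical stand-ins for the old and new signatures, not pylake code, and the parameter order of the newer one is an assumption.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for the two signatures described above.
def track_greedy_old(kymograph, line_width, pixel_threshold):
    pass


def track_greedy_new(kymograph, channel, line_width, pixel_threshold):
    pass


print(accepts_keyword(track_greedy_old, "channel"))
print(accepts_keyword(track_greedy_new, "channel"))
```

In the real environment the same helper could be applied to the imported function, for example `accepts_keyword(lumicks.pylake.track_greedy, "channel")`, assuming `track_greedy` is exposed at that location in the installed version.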

Create mamba environment for pylake v0.8.2

 mamba env remove -n pylake_082_clean
 mamba env create -f pylake_082_env.yml   # contents of pylake_082_env.yml follow

     name: pylake_082_clean
     channels:
     - conda-forge
     - defaults
     dependencies:
     - python=3.8
     - pip
     - jupyterlab
     - ipywidgets
     - numpy
     - matplotlib
     - h5py
     - pandas
     - scipy

 mamba activate pylake_082_clean

 #IMPORTANT:
 mamba install -c conda-forge lumicks.pylake=0.8.2
 mamba install -c conda-forge  numpy matplotlib ipywidgets scipy scikit-image pandas

 python -c "import lumicks.pylake; print(lumicks.pylake.__version__)"

 mamba install notebook=6

 #+ ipython_genutils    0.2.0  pyhd8ed1ab_1  conda-forge     Cached
 #+ nbclassic           1.1.0  pyhd8ed1ab_0  conda-forge     Cached
 #+ notebook            6.5.7  pyha770c72_0  conda-forge     Cached

 #Downgrade:
 #─────────────────────────────────────────────────────────────────────

 #- jupyter_client      8.6.3  pyhd8ed1ab_0  conda-forge     Cached
 #+ jupyter_client      7.4.9  pyhd8ed1ab_0  conda-forge     Cached

 jupyter notebook --version
 #pip show notebook

 (pylake_082_clean) jhuang@WS-2290C:/mnt/md1/DATA/Data_Vero_Kymographs$ mamba list jupyter
 # packages in environment at /home/jhuang/mambaforge/envs/pylake_082_clean:
 #
 # Name                    Version                   Build  Channel
 jupyter-lsp               2.2.5              pyhd8ed1ab_0    conda-forge
 jupyter_client            7.4.9              pyhd8ed1ab_0    conda-forge
 jupyter_core              5.8.1              pyh31011fe_0    conda-forge
 jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
 jupyter_server            2.14.2             pyhd8ed1ab_0    conda-forge
 jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
 jupyterlab                4.3.0              pyhd8ed1ab_0    conda-forge
 jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
 jupyterlab_server         2.27.3             pyhd8ed1ab_0    conda-forge
 jupyterlab_widgets        3.0.13             pyhd8ed1ab_0    conda-forge
 (pylake_082_clean) jhuang@WS-2290C:/mnt/md1/DATA/Data_Vero_Kymographs$ mamba list notebook
 # packages in environment at /home/jhuang/mambaforge/envs/pylake_082_clean:
 #
 # Name                    Version                   Build  Channel
 notebook                  6.5.7              pyha770c72_0    conda-forge
 notebook-shim             0.2.4              pyhd8ed1ab_0    conda-forge

 #Note that Notebook v6.x and Notebook v7 differ; this setup needs Notebook v6. Detailed breakdown:
 #Version                   Backend used                              Notes
 #Notebook v6.x and below   notebook (classic Tornado-based server)   No jupyter_server dependency
 #Notebook v7+              jupyter_server under the hood             Required dependency

Install the extension needed to recognize %%javascript, %matplotlib inline, and %matplotlib notebook

 mamba install widgetsnbextension -c conda-forge
 jupyter nbextension enable --py widgetsnbextension --sys-prefix
 #Not needed (it fails with ModuleNotFoundError: No module named 'ipympl'): jupyter nbextension enable --py ipympl --sys-prefix
 jupyter nbextension list

BUG_1: ModuleNotFoundError: No module named 'matplotlib_venn'
```
 mamba install -c conda-forge matplotlib-venn
```

BUG_2: TypeError: __init__() got an unexpected keyword argument 'drawtype'

 #This indicates that the installed Matplotlib is too new: the drawtype argument of matplotlib.widgets.RectangleSelector was deprecated in Matplotlib 3.5 and removed in 3.7, so only versions up to 3.4.3 accept it without a warning.

 mamba list matplotlib
 # packages in environment at /home/jhuang/mambaforge/envs/pylake_082_clean:
 #
 # Name                    Version                   Build  Channel
 #matplotlib                3.7.3            py38h578d9bd_0    conda-forge
 #matplotlib-base           3.7.3            py38h58ed7fa_0    conda-forge
 #matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
 #matplotlib-venn           1.1.1              pyhd8ed1ab_0    conda-forge

 #DEBUG: MODIFIED kymowidget.py: deleted "drawtype='box'," from the call below
 self.area_selector = RectangleSelector(self._axes, self.track_kymo,
                                        useblit=True,
                                        button=[3],
                                        minspanx=5, minspany=5,
                                        spancoords='pixels',
                                        interactive=False)
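
An alternative to deleting the argument by hand is to drop keyword arguments that the installed constructor no longer declares. The following is only a sketch of that idea, using the standard library; `selector_without_drawtype` is a hypothetical stand-in for a newer constructor and is not Matplotlib code.

```python
import inspect


def supported_kwargs(func, **kwargs):
    """Keep only the keyword arguments that `func` declares explicitly."""
    params = inspect.signature(func).parameters
    return {k: v for k, v in kwargs.items() if k in params}


# Hypothetical stand-in for a selector constructor that has no `drawtype`.
def selector_without_drawtype(ax, onselect, useblit=False, button=None,
                              minspanx=0, minspany=0, spancoords="data",
                              interactive=False):
    return {"useblit": useblit, "button": button, "interactive": interactive}


wanted = dict(drawtype="box", useblit=True, button=[3],
              minspanx=5, minspany=5, spancoords="pixels", interactive=False)
kept = supported_kwargs(selector_without_drawtype, **wanted)
print(sorted(kept))
```

Filtering this way keeps one code path working across library versions, at the cost of silently ignoring options, so editing kymowidget.py directly, as done above, remains the more explicit fix.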

OPTIONAL_BUG (NOT_DONE): [W 13:13:49.115 NotebookApp] Error loading server extension jupyterlab

 #Traceback (most recent call last):
 #  File "/home/jhuang/mambaforge/envs/pylake_082/lib/python3.8/site-packages/notebook/notebookapp.py", line 2050, in init_server_extensions
 #    func(self)
 #  File "/home/jhuang/mambaforge/envs/pylake_082/lib/python3.8/site-packages/jupyterlab/serverextension.py", line 71, in load_jupyter_server_extension
 #    extension.initialize()
 #  File "/home/jhuang/mambaforge/envs/pylake_082/lib/python3.8/site-packages/jupyterlab/labapp.py", line 921, in initialize
 #    super().initialize()
 #  File "/home/jhuang/mambaforge/envs/pylake_082/lib/python3.8/site-packages/jupyter_server/extension/application.py", line 437, in initialize
 #    self._prepare_handlers()
 #  File "/home/jhuang/mambaforge/envs/pylake_082/lib/python3.8/site-packages/jupyter_server/extension/application.py", line 327, in _prepare_handlers
 #    self.initialize_handlers()
 #  File "/home/jhuang/mambaforge/envs/pylake_082/lib/python3.8/site-packages/jupyterlab/labapp.py", line 735, in initialize_handlers
 #    page_config["token"] = self.serverapp.identity_provider.token
 #AttributeError: 'NotebookApp' object has no attribute 'identity_provider'

 jupyter serverextension disable jupyterlab
 mamba remove jupyterlab   # or: pip uninstall jupyterlab

Run the notebook from MultimerizationCode-main_v082 in the pylake_082_clean environment

 (pylake_082_clean) jhuang@WS-2290C:/mnt/md1/DATA/Data_Vero_Kymographs/MultimerizationCode-main_v082$ jupyter notebook

 #jupyter notebook --no-browser --port=8888

📊 Transposon Analysis: Sequencing Depth Requirements


🔬 Purpose of Analysis

Tn-seq / TraDIS / INSeq
    Used for mapping transposon insertion sites across the genome.
    Recommended coverage: 500× to 1000×

Native transposon (IS) detection
    Detects existing mobile genetic elements such as insertion sequences (IS) or integrons.
    Recommended coverage: 30× to 50×

Structural variant (SV) or mobility analysis
    Identifies large insertions, deletions, or mobile element insertions.
    Recommended coverage: at least 50×

Abundance or fitness estimation
    Quantifies insertion events under different experimental conditions.
    Recommended coverage: 200× to 300×

Long-read mobile element assembly
    Uses technologies like Oxford Nanopore (ONT) or PacBio to resolve repetitive mobile elements.
    Recommended coverage: 20× to 30×

🧬 Your Data (Example Samples)

Sample 1
    Total bases: 2.0 Gbp
    Mean read length: 8,635 bp
    Estimated read count: ~232,000
    Approximate genome coverage: ~500×

Sample 2
    Total bases: 2.5 Gbp
    Mean read length: 6,851 bp
    Estimated read count: ~365,000
    Approximate genome coverage: ~625×

Sample 8
    Total bases: 1.26 Gbp
    Mean read length: 5,180 bp
    Estimated read count: ~243,000
    Approximate genome coverage: ~315×

Sample WT
    Total bases: 1.88 Gbp
    Mean read length: 10,910 bp
    Estimated read count: ~172,000
    Approximate genome coverage: ~470×

Coverage is calculated as total bases divided by genome size, assuming an average bacterial genome size of ~4 Mb. Read counts are total bases divided by mean read length.

✅ Conclusion

Your sequencing depth ranges from roughly 315× to 625×, which is:
- At or above the 500× recommended for transposon insertion site mapping in Samples 1 and 2, and somewhat below it in Samples 8 and WT
- Above the 200× to 300× recommended for abundance or fitness estimation in all four samples
- Far more than the 30× to 50× needed for native mobile element discovery
If your goal is Tn-seq, Samples 1 and 2 reach the recommended depth for insertion site resolution, and all four samples are deep enough for abundance profiling.
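
The per-sample estimates can be recomputed from the total bases and mean read lengths listed for each sample. This is a minimal sketch, assuming the ~4 Mb genome size stated in this post; replace `GENOME_SIZE` with the actual genome length when it is known.

```python
GENOME_SIZE = 4_000_000  # assumed average bacterial genome size (~4 Mb)


def estimate_coverage(total_bases, genome_size=GENOME_SIZE):
    """Mean depth = total sequenced bases / genome size."""
    return total_bases / genome_size


def estimate_read_count(total_bases, mean_read_length):
    """Approximate number of reads = total bases / mean read length."""
    return round(total_bases / mean_read_length)


# (total bases, mean read length in bp) as listed for each sample.
samples = {
    "Sample 1": (2.0e9, 8635),
    "Sample 2": (2.5e9, 6851),
    "Sample 8": (1.26e9, 5180),
    "Sample WT": (1.88e9, 10910),
}

for name, (bases, mean_len) in samples.items():
    depth = estimate_coverage(bases)
    reads = estimate_read_count(bases, mean_len)
    print(f"{name}: ~{depth:.0f}x coverage, ~{reads:,} reads")
```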

📊 Recommended Sequencing Depths for Transposon Analysis

Tn-seq / TraDIS: Recommended depth is 500× to 1000×.
This high depth is necessary to ensure adequate coverage across insertion sites and to detect low-frequency transposon insertions.
Detection of native insertion sequences (IS) / mobile elements: A depth of 30× to 50× is generally sufficient.
This enables reliable identification of naturally occurring transposable elements in bacterial genomes.
Structural variants (SVs) and comparative genomics: Aim for more than 50× coverage.
Higher coverage improves confidence in detecting large insertions, deletions, or rearrangements during comparative analysis.

From Genbank-ID to Phylogenetic Tree Visualization using roary, raxml and ete3


http://xgenes.com/article/article-content/371/identify-all-occurrences-of-phage-hh1-mt880870-in-s-epidermidis-st2-genomes-from-public-and-clinical-isolates/

Download the GFF3 files

 #for id in CP009257.1 CP000521.1 CP059040.1 CU459141.1 CU468230.2 CP141284.1 CP131470.1 CP002080.1 CP002177.1 CP041365.1; do
 #  efetch -db nuccore -id ${id} -format gff3 > ${id}.gff;
 #  efetch -db nuccore -id ${id} -format fasta > ${id}.fasta;
 #done

 echo -e "CP009257\nCP000521\nCP059040\nCU459141\nCU468230\nCP141284\nCP131470\nCP002080\nCP002177\nCP041365\nCP191205" > ids.txt

 cat ids.txt | while read id; do
     efetch -db nuccore -id "$id" -format gff3 > "${id}.gff"
     efetch -db nuccore -id "$id" -format fasta > "${id}.fasta"
     # Append the FASTA sequence to the GFF3 file with ##FASTA header
     echo "##FASTA" >> "${id}.gff"
     cat "${id}.fasta" >> "${id}.gff"
     rm "${id}.fasta"  # Optionally remove the separate FASTA file
 done

Roary

 mv *.gff roary_input
 (bacto) roary -f roary_out -e --mafft -p 100 roary_input/*.gff

Generate phylogenetic tree using FastTree or raxml

 #iqtree -s core_gene_alignment.aln -m GTR+G -bb 1000 -nt AUTO

 (bacto) FastTree -nt roary_out/core_gene_alignment.aln > roary_out/core_gene_tree.nwk

 (bacto) raxml-ng --all \
   --msa roary_out/core_gene_alignment.aln \
   --model GTR+G \
   --bs-trees 1000 \
   --threads 40 \
   --prefix core_gene_tree_1000

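   After the run finishes, you get:

   core_gene_tree_1000.raxml.bestTree : the best maximum-likelihood tree.
   core_gene_tree_1000.raxml.bootstraps : the 1000 bootstrap trees.
   core_gene_tree_1000.raxml.support : the tree annotated with bootstrap support values (the one used for visualization).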

Install ete3_env

 mamba create -n ete3_env python=3.10 ete3 -c etetoolkit -y
 mamba activate ete3_env
 mamba install -c etetoolkit ete3 pyqt

Phylogenetic Tree Visualization using ete3

 #(ete3_env) python3 ~/Scripts/label_tree.py roary_out/core_gene_tree.nwk
 (ete3_env) python3 ~/Scripts/label_tree.py core_gene_tree_1000.raxml.support

 #!/usr/bin/env python3

 from ete3 import Tree, TreeStyle, TextFace, NodeStyle, faces
 import sys

 #(ete3_env) jhuang@WS-2290C:~/DATA/Data_Tam_DNAseq_2025_AYE/REVIEW$ python3 ~/Scripts/label_tree.py core_gene_tree_1000.raxml.support

 # -------------------------
 group_colors = {
     "A. baumannii": "#1f77b4",
     "A. junii": "#ff7f0e",
     "A. pittii": "#2ca02c",
     "A. oleivorans": "#d62728",
     "A. tandoii": "#9467bd",
 }

 leaf_name_map = {
     "CP191205": "A. baumannii HKAB-1",
     "CP009257": "A. baumannii AB030",
     "CP000521": "A. baumannii ATCC 17978",
     "CP059040": "A. baumannii ATCC 19606",
     "CU459141": "A. baumannii AYE",
     "CU468230": "A. baumannii SDF",
     "CP141284": "A. junii H1",
     "CP131470": "A. junii SC22",
     "CP002080": "A. oleivorans DR1",
     "CP002177": "A. pittii PHEA-2",
     "CP041365": "A. tandoii SE63"
 }

 def get_group(label):
     for group in group_colors:
         if label.startswith(group):
             return group
     return "Other"

 def colorize_node(node):
     if node.is_leaf():
         label = leaf_name_map.get(node.name, node.name)
         node.name = label
         group = get_group(label)
         color = group_colors.get(group, "black")
         face = TextFace(label, fsize=8, fgcolor=color)
         face.margin_left = 5  # <-- add margin to move label right from the point
         faces.add_face_to_node(face, node, column=0)

         nstyle = NodeStyle()
         nstyle["fgcolor"] = color
         nstyle["shape"] = "circle"
         node.set_style(nstyle)

 def render_tree(tree, filename, title="", circular=False):
     ts = TreeStyle()
     ts.mode = "c" if circular else "r"
     ts.show_leaf_name = False
     ts.title.add_face(TextFace(title, fsize=12, bold=True), column=0)
     ts.layout_fn = colorize_node
     ts.show_scale = True
     ts.branch_vertical_margin = 0

     # Customize scale bar:
     #ts.scale_len = 0.05  # Scale bar length in tree units
     #ts.scale_format = "%.2f"  # Format scale label to 2 decimals

     # Add a TextFace to show the scale you want
     #scale_face = TextFace("Scale: 0.05 substitutions/site", fsize=8)
     #ts.title.add_face(scale_face, column=1)

     tree.render(filename, w=1200, h=1200 if circular else 800, tree_style=ts)

 def main():
     if len(sys.argv) != 2:
         print("Usage: python3 label_tree.py <newick_file>")
         sys.exit(1)

     newick_file = sys.argv[1]
     try:
         with open(newick_file, "r") as f:
             nwk_str = f.read().strip()
         # Make sure the Newick string is terminated
         if not nwk_str.endswith(";"):
             nwk_str += ";"
         t = Tree(nwk_str, format=1)
     except Exception as e:
         print(f"[ERROR] Failed to load tree: {e}")
         sys.exit(1)

     # Attempt to reroot at outgroup (CP041365 = A. tandoii SE63)
     try:
         outgroup = t & "CP041365"
         if outgroup != t:
             t.set_outgroup(outgroup)
         else:
             print("[!] Outgroup is root already; skipping reroot.")
     except Exception as e:
         print(f"[!] Warning: Could not reroot tree at outgroup CP041365: {e}")

     #render_tree(t, "tree_colored_bootstrap.png", title="", circular=False)
     render_tree(t, "tree_colored_bootstrap.svg", title="", circular=False)
     print("✅ Saved: tree_colored_bootstrap.svg")

 if __name__ == "__main__":
     main()

Workflow for RNA-Binding Protein Enrichment and RNA Type Distribution Analysis (Ute’s Project)

Download RBP motifs (PWM) from ATtRACT_DB: ATtRACT_db.txt and pwm.txt; Convert to MEME format (if needed) (Under ~/DATA/Data_Ute/RBP_enrichments/)

 # Both generate_named_meme.py and generate_attract_human_meme.py produce a MEME file whose motif names include the Gene_name.
 # Note that "grep + generate_named_meme.py" is equivalent to generate_attract_human_meme.py.
 #grep "Homo_sapiens" ATtRACT_db.txt > attract_human.txt  #3256
 #python ~/Scripts/generate_named_meme.py pwm.txt attract_human.txt  #OUTPUT: attract_human_named.meme
 #python ~/Scripts/generate_attract_human_meme.py pwm.txt ATtRACT_db.txt  #OUTPUT: attract_human.meme, identical to attract_human_named.meme; both contain named motifs (1583 MOTIFs in total, e.g. "MOTIF 904_HNRNPH2"). The named files are for human reading only, not for the pipeline.

 #cut -f12 attract_human.txt | sort | uniq > valid_ids.txt
 python ~/Scripts/convert_attract_pwm_to_meme.py  #Input is "pwm.txt" #OUTPUT: "attract_human.meme", in total 1583 MOTIFs, for example "MOTIF 904"
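
convert_attract_pwm_to_meme.py itself is not listed in this post. As a rough sketch of what such a conversion involves, the snippet below parses PWM text and writes it in the minimal MEME text format. It assumes that pwm.txt uses ">ID<TAB>length" headers followed by one row of four probabilities (A, C, G, T) per position. The function names parse_pwm_text and to_meme, the uniform background and the nsites value are placeholders chosen for illustration, not taken from the real script.

```python
def parse_pwm_text(text):
    """Parse ATtRACT-style PWM text into {matrix_id: [[pA, pC, pG, pT], ...]}."""
    motifs, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the first token is the matrix ID
            current = line[1:].split()[0]
            motifs[current] = []
        elif current is not None:
            row = [float(x) for x in line.split()]
            if len(row) == 4:
                motifs[current].append(row)
    return motifs


def to_meme(motifs):
    """Render motifs in the minimal MEME text format (uniform background assumed)."""
    out = [
        "MEME version 4",
        "",
        "ALPHABET= ACGT",
        "",
        "Background letter frequencies",
        "A 0.25 C 0.25 G 0.25 T 0.25",
        "",
    ]
    for motif_id, rows in motifs.items():
        out.append(f"MOTIF {motif_id}")
        # nsites= 20 and E= 0 are arbitrary placeholder values
        out.append(f"letter-probability matrix: alength= 4 w= {len(rows)} nsites= 20 E= 0")
        out.extend(" ".join(f"{p:.6f}" for p in row) for row in rows)
        out.append("")
    return "\n".join(out)


# Hand-written two-position example, not real ATtRACT data
example = ">904\t2\n0.97\t0.01\t0.01\t0.01\n0.25\t0.25\t0.25\t0.25\n"
print(to_meme(parse_pwm_text(example)))
```

Keeping the bare matrix ID as the motif name (for example "MOTIF 904") matches what the pipeline file attract_human.meme is described as containing.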

Download GENCODE (Under ~/REFs/)

 #Visit the GENCODE site (https://www.gencodegenes.org/human/) and download:
 #    * GTF annotation file (e.g., gencode.v48.annotation.gtf.gz)
 #    * Corresponding genome FASTA (e.g., GRCh38.primary_assembly.genome.fa.gz)
 wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_48/gencode.v48.annotation.gtf.gz
 wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_48/GRCh38.primary_assembly.genome.fa.gz
 gunzip gencode.v48.annotation.gtf.gz
 gunzip GRCh38.primary_assembly.genome.fa.gz

Get 3UTR.fasta, 5UTR.fasta, CDS.fasta and transcripts.fasta based on GENCODE-files (Under ~/DATA/Data_Ute/RBP_enrichments/)

 grep ",\"protein_coding\"," ~/DATA/Data_Ute/Data_RNA-Seq_MKL-1+WaGa/results_2025_1/degenes/MKL-1_wt.EV_vs_parental-up.txt > MKL-1_wt.EV_vs_parental-up_protein_coding.txt
 grep ",\"protein_coding\"," ~/DATA/Data_Ute/Data_RNA-Seq_MKL-1+WaGa/results_2025_1/degenes/MKL-1_wt.EV_vs_parental-down.txt > MKL-1_wt.EV_vs_parental-down_protein_coding.txt
 grep ",\"protein_coding\"," ~/DATA/Data_Ute/Data_RNA-Seq_MKL-1+WaGa/results_2025_1/degenes/MKL-1_wt.EV_vs_parental-all.txt > MKL-1_wt.EV_vs_parental-all_protein_coding.txt
 grep ",\"protein_coding\"," ~/DATA/Data_Ute/Data_RNA-Seq_MKL-1+WaGa/results_2025_1/degenes/WaGa_wt.EV_vs_parental-up.txt > WaGa_wt.EV_vs_parental-up_protein_coding.txt
 grep ",\"protein_coding\"," ~/DATA/Data_Ute/Data_RNA-Seq_MKL-1+WaGa/results_2025_1/degenes/WaGa_wt.EV_vs_parental-down.txt > WaGa_wt.EV_vs_parental-down_protein_coding.txt
 grep ",\"protein_coding\"," ~/DATA/Data_Ute/Data_RNA-Seq_MKL-1+WaGa/results_2025_1/degenes/WaGa_wt.EV_vs_parental-all.txt > WaGa_wt.EV_vs_parental-all_protein_coding.txt

 #Usage: python extract_transcript_parts.py <gene_list.txt> <gencode.gtf> <genome.fa> <out_prefix>, the script generates <out_prefix>.[transcripts|CDS|5UTR|3UTR].fasta files.
 python ~/Scripts/extract_transcript_parts.py MKL-1_wt.EV_vs_parental-down_protein_coding.txt ~/REFs/gencode.v48.annotation.gtf ~/REFs/GRCh38.primary_assembly.genome.fa MKL-1_down  #112
 python ~/Scripts/extract_transcript_parts.py MKL-1_wt.EV_vs_parental-up_protein_coding.txt ~/REFs/gencode.v48.annotation.gtf ~/REFs/GRCh38.primary_assembly.genome.fa MKL-1_up  #5988
 python ~/Scripts/extract_transcript_parts.py MKL-1_wt.EV_vs_parental-all_protein_coding.txt ~/REFs/gencode.v48.annotation.gtf ~/REFs/GRCh38.primary_assembly.genome.fa MKL-1_background  #19239
 python ~/Scripts/extract_transcript_parts.py WaGa_wt.EV_vs_parental-down_protein_coding.txt ~/REFs/gencode.v48.annotation.gtf ~/REFs/GRCh38.primary_assembly.genome.fa WaGa_down  #93
 python ~/Scripts/extract_transcript_parts.py WaGa_wt.EV_vs_parental-up_protein_coding.txt ~/REFs/gencode.v48.annotation.gtf ~/REFs/GRCh38.primary_assembly.genome.fa WaGa_up  #6538
 python ~/Scripts/extract_transcript_parts.py WaGa_wt.EV_vs_parental-all_protein_coding.txt ~/REFs/gencode.v48.annotation.gtf ~/REFs/GRCh38.primary_assembly.genome.fa WaGa_background  #19239
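
extract_transcript_parts.py is not shown here. One step any such script needs is deciding whether an untranslated segment belongs to the 5′ or the 3′ UTR, since GENCODE GTF files (as far as I know) mark both with a generic UTR feature. A minimal sketch of that decision follows. It assumes 1-based inclusive GTF coordinates and that the CDS span of the same transcript is already known; classify_utr is a hypothetical helper, not the function used in the real script.

```python
def classify_utr(utr_start, utr_end, cds_start, cds_end, strand):
    """Return '5UTR' or '3UTR' for one UTR segment of a transcript.

    cds_start/cds_end are the lowest and highest CDS coordinates of the
    same transcript (genomic coordinates, regardless of strand).
    """
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")
    if strand == "+":
        # On the plus strand the 5' end lies at lower coordinates
        upstream = utr_end < cds_start
    else:
        # On the minus strand the 5' end lies at higher coordinates
        upstream = utr_start > cds_end
    return "5UTR" if upstream else "3UTR"


print(classify_utr(100, 150, 200, 900, "+"))   # 5UTR
print(classify_utr(950, 1200, 200, 900, "+"))  # 3UTR
print(classify_utr(950, 1200, 200, 900, "-"))  # 5UTR
```

The 3′ UTR segments of a transcript would then be concatenated in transcript order (and reverse-complemented on the minus strand) before being written to the 3UTR FASTA file.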

 python ~/Scripts/filter_short_fasta.py MKL-1_up.3UTR.fasta MKL-1_up.filtered.3UTR.fasta
 python ~/Scripts/filter_short_fasta.py MKL-1_down.3UTR.fasta MKL-1_down.filtered.3UTR.fasta
 python ~/Scripts/filter_short_fasta.py MKL-1_background.3UTR.fasta MKL-1_background.filtered.3UTR.fasta
 # Check how many sequences the background file contains:
 #grep -c "^>" MKL-1_background.filtered.3UTR.fasta
 #68890
 # Check how many distinct background sequences have at least one FIMO hit:
 #cut -f3 fimo_background_MKL-1_background/fimo.tsv | sort | uniq | wc -l
 #67841
 python ~/Scripts/filter_short_fasta.py WaGa_background.3UTR.fasta WaGa_background.filtered.3UTR.fasta
 python ~/Scripts/filter_short_fasta.py WaGa_up.3UTR.fasta WaGa_up.filtered.3UTR.fasta
 python ~/Scripts/filter_short_fasta.py WaGa_down.3UTR.fasta WaGa_down.filtered.3UTR.fasta
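
filter_short_fasta.py is not listed in this post. The report further down states that sequences shorter than 16 nucleotides were excluded, so the filter can be sketched roughly as below. The function names read_fasta and filter_short and the default min_len=16 are assumptions made for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_short(records, min_len=16):
    """Keep only records whose sequence has at least min_len nucleotides."""
    return [(h, s) for h, s in records if len(s) >= min_len]


# Toy input: tx1 has 4 nt, tx2 has 20 nt spread over two lines
demo = [">tx1", "ACGT", ">tx2", "ACGTACGTACGT", "ACGTACGT"]
kept = filter_short(read_fasta(demo))
for header, seq in kept:
    print(f">{header}\n{seq}")
```

Very short sequences cannot contain a full-length match for the longer motifs, which is one plausible reason for removing them before the FIMO scan.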

FIMO for motif scan on 3UTR sequences

 # Generate fimo_foreground_MKL-1_down/fimo.tsv
 mamba activate meme_env
 fimo --thresh 1e-4 --oc fimo_foreground_MKL-1_down attract_human.meme MKL-1_down.3UTR.fasta
 fimo --thresh 1e-4 --oc fimo_foreground_MKL-1_up attract_human.meme MKL-1_up.3UTR.fasta
 fimo --thresh 1e-4 --oc fimo_background_MKL-1_background attract_human.meme MKL-1_background.3UTR.fasta
 fimo --thresh 1e-4 --oc fimo_foreground_WaGa_down attract_human.meme WaGa_down.3UTR.fasta
 fimo --thresh 1e-4 --oc fimo_foreground_WaGa_up attract_human.meme WaGa_up.3UTR.fasta
 fimo --thresh 1e-4 --oc fimo_background_WaGa_background attract_human.meme WaGa_background.3UTR.fasta

 #Keep only one match per gene (based on Ensembl Gene ID like ENSG00000134871) for each RBP motif, even if multiple transcripts have hits.
 #python ~/Scripts/filter_fimo_best_per_gene.py --input fimo_foreground/fimo.tsv --output fimo_foreground/fimo.filtered.tsv

 python ~/Scripts/convert_gtf_to_Gene_annotation_TSV_file.py ~/REFs/Homo_sapiens.GRCh38.114.gtf Homo_sapiens.GRCh38.gene_annotation.tsv

 python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
 --input fimo_foreground_MKL-1_down/fimo.tsv \
 --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
 --output_filtered fimo_foreground_MKL-1_down/fimo.filtered.tsv \
 --output_annotated fimo_foreground_MKL-1_down/fimo.filtered.annotated.tsv
 #21559
 python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
 --input fimo_foreground_MKL-1_up/fimo.tsv \
 --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
 --output_filtered fimo_foreground_MKL-1_up/fimo.filtered.tsv \
 --output_annotated fimo_foreground_MKL-1_up/fimo.filtered.annotated.tsv
 #(736661 rows)
 python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
 --input fimo_background_MKL-1_background/fimo.tsv \
 --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
 --output_filtered fimo_background_MKL-1_background/fimo.filtered.tsv \
 --output_annotated fimo_background_MKL-1_background/fimo.filtered.annotated.tsv
 #(1869075 rows)
 python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
 --input fimo_foreground_WaGa_down/fimo.tsv \
 --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
 --output_filtered fimo_foreground_WaGa_down/fimo.filtered.tsv \
 --output_annotated fimo_foreground_WaGa_down/fimo.filtered.annotated.tsv
 #(20364 rows)
 python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
 --input fimo_foreground_WaGa_up/fimo.tsv \
 --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
 --output_filtered fimo_foreground_WaGa_up/fimo.filtered.tsv \
 --output_annotated fimo_foreground_WaGa_up/fimo.filtered.annotated.tsv
 #(805634 rows)
 python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
 --input fimo_background_WaGa_background/fimo.tsv \
 --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
 --output_filtered fimo_background_WaGa_background/fimo.filtered.tsv \
 --output_annotated fimo_background_WaGa_background/fimo.filtered.annotated.tsv
 #(1811615 rows)
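
filter_fimo_best_per_gene_annotated.py is not shown here. The idea stated above (one match per gene for each RBP motif, even when several transcripts are hit) can be sketched as follows. The sketch keeps the hit with the lowest p-value per (motif, gene) pair. It assumes the standard FIMO TSV columns motif_id, sequence_name and p-value, and that each sequence name contains an Ensembl gene ID such as ENSG00000134871. The header layout in the demo and the name best_hit_per_gene are hypothetical.

```python
import csv
import io
import re

GENE_RE = re.compile(r"ENSG\d+")


def best_hit_per_gene(tsv_text):
    """Return {(motif_id, gene_id): row} keeping the lowest p-value per pair."""
    best = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        motif = row.get("motif_id")
        if not motif or motif.startswith("#"):
            continue  # skip the trailing comment lines in FIMO output
        match = GENE_RE.search(row.get("sequence_name") or "")
        if match is None:
            continue  # no gene ID in the sequence name
        key = (motif, match.group(0))
        if key not in best or float(row["p-value"]) < float(best[key]["p-value"]):
            best[key] = row
    return best


# Hand-written rows; the "ENST|ENSG" header layout is an assumption
demo = (
    "motif_id\tsequence_name\tp-value\n"
    "414\tENST00000000001|ENSG00000134871\t1e-5\n"
    "414\tENST00000000002|ENSG00000134871\t1e-6\n"
    "399\tENST00000000001|ENSG00000134871\t5e-5\n"
    "# FIMO comment line\n"
)
best = best_hit_per_gene(demo)
for (motif, gene), row in sorted(best.items()):
    print(motif, gene, row["sequence_name"], row["p-value"])
```

Collapsing to genes before counting avoids inflating the foreground and background counts for genes with many annotated transcripts.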

 python ~/Scripts/run_enrichment.py \
     --attract ATtRACT_db.txt \
     --fimo_fg fimo_foreground_MKL-1_up/fimo.filtered.tsv \
     --fimo_bg fimo_background_MKL-1_background/fimo.filtered.tsv \
     --output rbp_enrichment_MKL-1_up.csv \
     --strategy inclusive
 python ~/Scripts/run_enrichment.py \
     --attract ATtRACT_db.txt \
     --fimo_fg fimo_foreground_MKL-1_down/fimo.filtered.tsv \
     --fimo_bg fimo_background_MKL-1_background/fimo.filtered.tsv \
     --output rbp_enrichment_MKL-1_down.csv
 python ~/Scripts/run_enrichment.py \
     --attract ATtRACT_db.txt \
     --fimo_fg fimo_foreground_WaGa_up/fimo.filtered.tsv \
     --fimo_bg fimo_background_WaGa_background/fimo.filtered.tsv \
     --output rbp_enrichment_WaGa_up.csv
 python ~/Scripts/run_enrichment.py \
     --attract ATtRACT_db.txt \
     --fimo_fg fimo_foreground_WaGa_down/fimo.filtered.tsv \
     --fimo_bg fimo_background_WaGa_background/fimo.filtered.tsv \
     --output rbp_enrichment_WaGa_down.csv

 #Column Meaning in rbp_enrichment_*.[csv|xlsx]
 #a  Number of unique foreground UTRs hit by the RBP
 #b  Number of unique background UTRs hit by the RBP
 #c  Total number of foreground UTRs
 #d  Total number of background UTRs
 #p_value, fdr   From Fisher's exact test on enrichment
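
run_enrichment.py is not listed here, so the following is only a sketch of how a, b, c and d can be turned into a p-value and an FDR. It assumes a one-sided Fisher's exact test for enrichment on the 2x2 table [[a, c - a], [b, d - b]] and a standard Benjamini-Hochberg adjustment; whether the real script uses exactly this table and this alternative is an assumption. The function names are hypothetical and only the standard library is used.

```python
from math import comb


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for enrichment in the foreground.

    a: foreground sequences with a hit, c: all foreground sequences
    b: background sequences with a hit, d: all background sequences
    """
    hits, total = a + b, c + d
    denom = comb(total, c)
    upper = min(hits, c)
    # Hypergeometric tail: probability of seeing a or more foreground hits
    numer = sum(comb(hits, k) * comb(total - hits, c - k) for k in range(a, upper + 1))
    return numer / denom


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p = fisher_greater(1, 1, 1, 118)
print(p)  # 2/119, about 0.0168
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

With a = 1, b = 1, c = 1, d = 118 this gives 2/119 ≈ 0.0168, which agrees with the p-value listed for FXR2 in the miRNA results further down in this post.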

 # The following PDF files were neither used nor sent.
 python ~/Scripts/plot_volcano.py --csv rbp_enrichment_MKL-1_up.csv --output MKL-1_volcano_up.pdf --title "Upregulated MKL-1"
 python ~/Scripts/plot_rbp_heatmap.py \
 --csvs rbp_enrichment_MKL-1_up.csv rbp_enrichment_MKL-1_down.csv \
 --labels Upregulated Downregulated \
 --output MKL-1_rbp_enrichment_heatmap.pdf

 #-- Get all genes the number 1621 refers to --
 #AGO2,1621,5050,5732,12987,1.0,1.0   #MKL-1_up
 #motif_ids are 414 and 399
 # Anchor on the tab so that only motif IDs exactly equal to 414 or 399 match
 grep -P "^414\t" fimo.filtered.annotated.tsv > AGO2.txt
 grep -P "^399\t" fimo.filtered.annotated.tsv >> AGO2.txt
 cut -d$'\t' -f11 AGO2.txt | sort -u > AGO2_uniq.txt
 wc -l AGO2_uniq.txt
 #1621 AGO2_uniq.txt

miRNA motif analysis using ATtRACT + FIMO

     #* Extract their sequences
     #* Generate a background set
     #* Run RBP enrichment (e.g., with RBPmap or FIMO)
     #* Get p-adjusted enrichment stats (e.g., Fisher + BH)

     #Input_1. DE results (differential expression file from smallRNA-seq)
         #Input: up- and down-, all-regulated files
         #~/DATA/Data_Ute/Data_Ute_smallRNA_7/summaries_exo7/miRNAs/untreated_vs_parental_cells-up.txt  #66
         #~/DATA/Data_Ute/Data_Ute_smallRNA_7/summaries_exo7/miRNAs/untreated_vs_parental_cells-down.txt  #38
         #~/DATA/Data_Ute/Data_Ute_smallRNA_7/summaries_exo7/miRNAs/untreated_vs_parental_cells-all.txt  #1304
         #Format: 1st column = miRNA ID (e.g., hsa-miR-21-5p), optionally with other stats.

         cp ~/DATA/Data_Ute/Data_Ute_smallRNA_7/summaries_exo7/miRNAs/untreated_vs_parental_cells-*.txt .
         #"hsa-miR-3180|hsa-miR-3180-3p"
         #>hsa-miR-3180 MIMAT0018178 Homo sapiens miR-3180
         #UGGGGCGGAGCUUCCGGAG
         #>hsa-miR-3180-3p MIMAT0015058 Homo sapiens miR-3180-3p
         #UGGGGCGGAGCUUCCGGAGGCC

     #Input_2. Reference FASTA (Reference sequences from miRBase or GENCODE)
         #From miRBase: https://mirbase.org/download/  https://mirbase.org/download/CURRENT/
         ##miRBase_v21
         #mature.fa.gz → contains mature miRNA sequences
         #hairpin.fa.gz → for pre-miRNAs
         mv ../RBP_enrichments_OLD_DEL/mature_v21.fa .

     # -- Extract Sequences + Background Set --

     #Inputs:
     #    * up_miRNA.txt and down_miRNA.txt: DE results (first column = miRNA name, e.g., hsa-miR-21-5p)
     #    * mature.fa or hairpin.fa from miRBase

     #Outputs:
     #    * mirna_up.fa
     #    * mirna_down.fa
     #    * mirna_background.fa

     #Use all remaining miRNAs as background:
     python ~/Scripts/prepare_miRNA_sets.py untreated_vs_parental_cells-up.txt untreated_vs_parental_cells-down.txt mature_v21.fa mirna --full-background
     mv mirna_background.fa mirna_full-background.fa

     #Alternative: use a random subset as background. The generated background has max(#up, #down) records, which in this case is the size of the up set (84 records):
     #python ~/Scripts/prepare_miRNA_sets.py untreated_vs_parental_cells-up.txt untreated_vs_parental_cells-down.txt mature_v21.fa mirna
     # grep ">" mature_v21.fa | wc -l  #35828
     # grep ">" mirna_full-background.fa | wc -l  #35710-->35723
     # grep ">" mirna_up.fa | wc -l  #84
     # grep ">" mirna_down.fa | wc -l  #34
     # grep ">" mirna_background.fa | wc -l  #84-->67
     # #35,710 + 84 + 34 = 35,828
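
prepare_miRNA_sets.py is not shown in this post. A minimal sketch of the full-background split is given below, under two assumptions: DE entries may join several IDs with "|" (as in "hsa-miR-3180|hsa-miR-3180-3p" above), and every miRBase record that is neither up nor down goes into the background. The names split_ids and partition are hypothetical.

```python
def split_ids(de_ids):
    """Expand entries such as 'hsa-miR-3180|hsa-miR-3180-3p' into single IDs."""
    names = set()
    for entry in de_ids:
        names.update(part.strip() for part in entry.split("|") if part.strip())
    return names


def partition(records, up_ids, down_ids):
    """Split (name, sequence) records into up, down and full-background lists."""
    up_names, down_names = split_ids(up_ids), split_ids(down_ids)
    up, down, background = [], [], []
    for name, seq in records:
        if name in up_names:
            up.append((name, seq))
        elif name in down_names:
            down.append((name, seq))
        else:
            background.append((name, seq))
    return up, down, background


records = [
    ("hsa-miR-3180", "UGGGGCGGAGCUUCCGGAG"),
    ("hsa-miR-3180-3p", "UGGGGCGGAGCUUCCGGAGGCC"),
    ("placeholder-miR", "ACGUACGUACGUACGUACGU"),  # made-up record for the demo
]
up, down, background = partition(records, ["hsa-miR-3180|hsa-miR-3180-3p"], [])
print(len(up), len(down), len(background))
```

Expanding "|"-joined entries is one way a DE list with 66 rows could yield more than 66 FASTA records.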

     fimo --thresh 1e-4 --oc fimo_mirna_down attract_human.meme mirna_down.fa
     fimo --thresh 1e-4 --oc fimo_mirna_up attract_human.meme mirna_up.fa
     fimo --thresh 1e-4 --oc fimo_mirna_full-background attract_human.meme mirna_full-background.fa
     #fimo --thresh 1e-4 --oc fimo_mirna_background attract_human.meme mirna_background.fa

     python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
     --input fimo_mirna_down/fimo.tsv \
     --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
     --output_filtered fimo_mirna_down/fimo.filtered.tsv \
     --output_annotated fimo_mirna_down/fimo.filtered.annotated.tsv  #21
     python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
     --input fimo_mirna_up/fimo.tsv \
     --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
     --output_filtered fimo_mirna_up/fimo.filtered.tsv \
     --output_annotated fimo_mirna_up/fimo.filtered.annotated.tsv  #48
     python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
     --input fimo_mirna_full-background/fimo.tsv \
     --annot Homo_sapiens.GRCh38.gene_annotation.tsv \
     --output_filtered fimo_mirna_full-background/fimo.filtered.tsv \
     --output_annotated fimo_mirna_full-background/fimo.filtered.annotated.tsv  #896
     #python ~/Scripts/filter_fimo_best_per_gene_annotated.py \
     #--input fimo_mirna_background/fimo.tsv \
     #--annot Homo_sapiens.GRCh38.gene_annotation.tsv \
     #--output_filtered fimo_mirna_background/fimo.filtered.tsv \
     #--output_annotated fimo_mirna_background/fimo.filtered.annotated.tsv  #57

     python ~/Scripts/run_enrichment_miRNAs.py \
         --attract ATtRACT_db.txt \
         --fimo_fg fimo_mirna_up/fimo.filtered.tsv \
         --fimo_bg fimo_mirna_full-background/fimo.filtered.tsv \
         --output rbp_enrichment_mirna_up.csv \
         --strategy inclusive
     python ~/Scripts/run_enrichment_miRNAs.py \
         --attract ATtRACT_db.txt \
         --fimo_fg fimo_mirna_down/fimo.filtered.tsv \
         --fimo_bg fimo_mirna_full-background/fimo.filtered.tsv \
         --output rbp_enrichment_mirna_down.csv \
         --strategy inclusive
     #python ~/Scripts/run_enrichment_miRNAs.py \
     #    --attract ATtRACT_db.txt \
     #    --fimo_fg fimo_mirna_up/fimo.filtered.tsv \
     #    --fimo_bg fimo_mirna_background/fimo.filtered.tsv \
     #    --output rbp_enrichment_mirna_up_on_subset-background.csv \
     #    --strategy inclusive
     #python ~/Scripts/run_enrichment_miRNAs.py \
     #    --attract ATtRACT_db.txt \
     #    --fimo_fg fimo_mirna_down/fimo.filtered.tsv \
     #    --fimo_bg fimo_mirna_background/fimo.filtered.tsv \
     #    --output rbp_enrichment_mirna_down_on_subset-background.csv \
     #    --strategy inclusive

     #FXR2   1 (hsa-miR-92b-5p)  1   1   118 0.0168067226890756  0.365546218487395
     #ORB2   1 (hsa-miR-4748)    1   1   118 0.0168067226890756  0.365546218487395

     #-- Get the miRNAs that the counts for FXR2 and ORB2 refer to --
     grep "^FXR2" ATtRACT_db.txt
     #motif_id is M020_0.6
     grep "^M020_0.6" fimo_mirna_up/fimo.filtered.annotated.tsv > FXR2.txt
     grep "^M020_0.6" fimo_mirna_up/fimo.filtered.annotated.tsv
     #cut -d$'\t' -f11 AGO2.txt | sort -u > AGO2_uniq.txt
     #wc -l AGO2_uniq.txt (1621 records)

     grep "^ORB2" ATtRACT_db.txt
     grep "^M120_0.6" fimo_mirna_up/fimo.filtered.annotated.tsv

Reports for RBP enrichment results

 Please find attached the results of the RNA-binding protein (RBP) enrichment analysis using FIMO and the ATtRACT motif database, along with a brief description of the procedures used for both the 3′ UTR-based analysis (RNA-seq) and the miRNA-based analysis (small RNA-seq).

 1. RBP Motif Enrichment from RNA-seq (3′ UTRs)

 We focused on 3′ UTRs, as they are key regulatory regions for RBPs. Sequences shorter than 16 nucleotides were excluded. Using FIMO (from the MEME suite) with motifs from the ATtRACT database, we scanned both foreground and background 3′ UTR sets to identify motif occurrences.

 Foreground: Differentially expressed transcripts (e.g., MKL-1 up/down, WaGa up/down)
 Background: All non-differentially expressed transcripts

 Analysis: Fisher’s exact test was used to assess motif enrichment; p-values were adjusted using the Benjamini–Hochberg method.

 Output files (RNA-seq):

     * rbp_enrichment_MKL-1_down.xlsx / .png
     * rbp_enrichment_MKL-1_up.xlsx / .png
     * rbp_enrichment_WaGa_down.xlsx / .png
     * rbp_enrichment_WaGa_up.xlsx / .png

 2. RBP Motif Enrichment from Small RNA-seq (miRNAs)

 This analysis focused on differentially expressed miRNAs, using mature miRNA sequences from miRBase (v21). We scanned for RBP binding motifs within these sequences using FIMO and assessed motif enrichment relative to background sets.

 Foreground: DE miRNAs (up/down) from small RNA-seq comparisons
 Background: All other miRNAs from miRBase

 Analysis: FIMO was used with --thresh 1e-4, followed by annotation and filtering. Enrichment was assessed using Fisher’s test + BH correction.

 Output files (miRNAs):

     * rbp_enrichment_mirna_down.xlsx
     * rbp_enrichment_mirna_up.xlsx

 How to Interpret the Numbers
 Each row in the result tables represents one RBP and its enrichment statistics:

 a: foreground genes/sequences with the motif
 b: background genes/sequences with the motif
 c: total number of foreground genes/sequences
 d: total number of background genes/sequences

 These values are used to compute p-values and FDRs.

 For example, in rbp_enrichment_MKL-1_up.xlsx, AGO2 has a = 1621, meaning FIMO detected AGO2 motifs in 1,621 genes in the MKL-1 upregulated set. These genes are listed in AGO2_uniq.txt.

 Similarly, for the miRNA analysis (e.g., rbp_enrichment_mirna_up.xlsx and rbp_enrichment_mirna_down.xlsx), the numbers represent counts of unique miRNAs with at least one significant motif hit. As examples, I calculated the detailed membership for FXR2 and ORB2.

 diff RBP_enrichments/rbp_enrichment_MKL-1_up.csv  RBP_enrichments_OLD_DEL/rbp_enrichment_MKL-1_up.csv
 #diff RBP_enrichments/rbp_enrichment_MKL-1_up.xlsx RBP_enrichments_OLD_DEL/rbp_enrichment_MKL-1_up.xlsx
 #diff RBP_enrichments/rbp_enrichment_MKL-1_up.png   RBP_enrichments_OLD_DEL/rbp_enrichment_MKL-1_up.png
 diff RBP_enrichments/rbp_enrichment_MKL-1_down.csv RBP_enrichments_OLD_DEL/rbp_enrichment_MKL-1_down.csv
 #diff RBP_enrichments/rbp_enrichment_MKL-1_down.xlsx RBP_enrichments_OLD_DEL/rbp_enrichment_MKL-1_down.xlsx
 #diff RBP_enrichments/rbp_enrichment_MKL-1_down.png RBP_enrichments_OLD_DEL/rbp_enrichment_MKL-1_down.png
 diff RBP_enrichments/rbp_enrichment_WaGa_up.csv RBP_enrichments_OLD_DEL/rbp_enrichment_WaGa_up.csv
 #diff RBP_enrichments/rbp_enrichment_WaGa_up.xlsx RBP_enrichments_OLD_DEL/rbp_enrichment_WaGa_up.xlsx
 #diff RBP_enrichments/rbp_enrichment_WaGa_up.png RBP_enrichments_OLD_DEL/rbp_enrichment_WaGa_up.png
 diff RBP_enrichments/rbp_enrichment_WaGa_down.csv RBP_enrichments_OLD_DEL/rbp_enrichment_WaGa_down.csv
 #diff RBP_enrichments/rbp_enrichment_WaGa_down.xlsx RBP_enrichments_OLD_DEL/rbp_enrichment_WaGa_down.xlsx
 #diff RBP_enrichments/rbp_enrichment_WaGa_down.png RBP_enrichments_OLD_DEL/rbp_enrichment_WaGa_down.png
 okular RBP_enrichments/MKL-1_volcano_up.pdf
 okular RBP_enrichments_OLD_DEL/MKL-1_volcano_up.pdf
 okular RBP_enrichments/MKL-1_rbp_enrichment_heatmap.pdf
 okular RBP_enrichments_OLD_DEL/MKL-1_rbp_enrichment_heatmap.pdf

 diff RBP_enrichments/rbp_enrichment_mirna_up.csv RBP_enrichments_OLD_DEL/rbp_enrichment_mirna_up_on_full-background.csv
 #diff RBP_enrichments/rbp_enrichment_mirna_up.xlsx RBP_enrichments_OLD_DEL/rbp_enrichment_mirna_up.xlsx
 #diff RBP_enrichments/rbp_enrichment_mirna_up.png RBP_enrichments_OLD_DEL/rbp_enrichment_mirna_up_on_full-background.png
 diff RBP_enrichments/rbp_enrichment_mirna_down.csv RBP_enrichments_OLD_DEL/rbp_enrichment_mirna_down_on_full-background.csv
 #diff RBP_enrichments/rbp_enrichment_mirna_down.xlsx RBP_enrichments_OLD_DEL/rbp_enrichment_mirna_down.xlsx
 #diff RBP_enrichments/rbp_enrichment_mirna_down.png RBP_enrichments_OLD_DEL/rbp_enrichment_mirna_down_on_full-background.png

Generate the sequence logos for the enriched RBP motifs

 import os
 import pandas as pd
 import matplotlib.pyplot as plt
 import logomaker
 from pathlib import Path
 import re

 # --------------
 # Config
 # --------------
 motif_table_path = "ATtRACT_db.txt"
 pwm_file_path = "pwm.txt"
 input_files = [
     "rbp_enrichment_MKL-1_up.csv",
     "rbp_enrichment_MKL-1_down.csv",
     "rbp_enrichment_WaGa_up.csv",
     "rbp_enrichment_WaGa_down.csv",
     "rbp_enrichment_mirna_up.csv",
     "rbp_enrichment_mirna_down.csv"
 ]
 output_dir = "sequence_logos"
 os.makedirs(output_dir, exist_ok=True)

 # --------------
 # Helper Functions
 # --------------
 def load_pwm_file(pwm_path):
     pwm_dict = {}
     current_id = None
     pwm_matrix = []
     with open(pwm_path, 'r') as f:
         for line in f:
             line = line.strip()
             if not line:
                 continue
             if line.startswith('>'):
                 if current_id and pwm_matrix:
                     pwm_dict[current_id] = pwm_matrix
                 parts = line[1:].split()
                 current_id = parts[0]
                 pwm_matrix = []
             else:
                 row = [float(x) for x in line.split()]
                 if len(row) == 4:
                     pwm_matrix.append(row)
     if current_id and pwm_matrix:
         pwm_dict[current_id] = pwm_matrix
     return pwm_dict

 def sanitize_filename(text):
     return re.sub(r'[^\w\-_.]', '_', text)

 def plot_logo(pwm_df, title, output_path):
     fig, ax = plt.subplots(figsize=(len(pwm_df)*0.5, 2))
     logo = logomaker.Logo(pwm_df, ax=ax, color_scheme='classic')
     logo.style_spines(visible=False)
     logo.style_spines(spines=['left', 'bottom'], visible=True)
     ax.set_ylabel("Probability")  # the PWM rows are plotted as probabilities, not bits
     ax.set_title(title)
     fig.tight_layout()
     fig.savefig(output_path, bbox_inches='tight')
     plt.close(fig)

 # --------------
 # Load Motif Table and Filter per your strategy
 # --------------
 motif_table = pd.read_csv(motif_table_path, sep='\t')

 # Filter rows where Score ends with '**'
 motif_table = motif_table[motif_table['Score'].str.endswith('**')].copy()

 # Remove '**' and convert Score to float
 motif_table['Score'] = motif_table['Score'].str.replace(r'\*\*$', '', regex=True).astype(float)

 # Function to keep Homo sapiens rows if exist, otherwise all
 def keep_human_if_exists(group):
     human_rows = group[group['Organism'] == 'Homo_sapiens']
     return human_rows if not human_rows.empty else group

 motif_table = motif_table.groupby('Gene_name', group_keys=False).apply(keep_human_if_exists)

 # If multiple remain per RBP, pick one randomly (seeded for reproducibility)
 motif_table = motif_table.groupby('Gene_name').apply(lambda g: g.sample(1, random_state=42)).reset_index(drop=True)

 # Build motif map for quick lookup
 motif_map = motif_table[['Gene_name', 'Matrix_id']].drop_duplicates()

 # Load PWM dictionary
 pwm_dict = load_pwm_file(pwm_file_path)

 # --------------
 # Process Each Enrichment File
 # --------------
 for file in input_files:
     print(f"\n📂 Processing {file}")
     df = pd.read_csv(file)

     if 'fdr' not in df.columns:
         print(f"⚠️ Skipping {file}: no 'fdr' column")
         continue

     sig_df = df[df['fdr'] <= 0.05].copy()

     for _, row in sig_df.iterrows():
         rbp_name = row['RBP']
         matches = motif_map[motif_map['Gene_name'] == rbp_name]

         if matches.empty:
             print(f"  ⚠️ No motif entry for RBP: {rbp_name}")
             continue

         # Should be exactly one row per RBP now
         motif_row = matches.iloc[0]
         matrix_id = motif_row['Matrix_id']

         if matrix_id not in pwm_dict:
             print(f"  ⚠️ PWM not found for matrix ID {matrix_id} (RBP: {rbp_name})")
             continue

         pwm = pwm_dict[matrix_id]
         pwm_df = pd.DataFrame(pwm, columns=list("ACGT"))

         title = f"{rbp_name} ({matrix_id})"
         safe_name = sanitize_filename(f"{Path(file).stem}_{rbp_name}_{matrix_id}.png")
         out_path = os.path.join(output_dir, safe_name)

         plot_logo(pwm_df, title, out_path)

 print("\n✅ Sequence logo generation complete.")
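
The script above passes the probability rows from pwm.txt straight to logomaker, so every stack has a total height of 1. If stack heights in bits are preferred, each position can be scaled by its information content, which for a four-letter alphabet is 2 minus the Shannon entropy. The sketch below shows that calculation with the standard library only; it applies no small-sample correction, and the function names are hypothetical.

```python
from math import log2


def information_content(row):
    """Information content in bits of one PWM position (4-letter alphabet)."""
    entropy = -sum(p * log2(p) for p in row if p > 0)
    return 2.0 - entropy


def to_information_matrix(pwm):
    """Scale each probability by the information content of its position."""
    return [[p * information_content(row) for p in row] for row in pwm]


pwm = [
    [1.0, 0.0, 0.0, 0.0],      # fully conserved position: 2 bits
    [0.25, 0.25, 0.25, 0.25],  # uninformative position: 0 bits
]
for row in to_information_matrix(pwm):
    print([round(x, 3) for x in row])
```

The resulting rows can be wrapped in a DataFrame with columns A, C, G, T in the same way as pwm_df before plotting.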

Plot pie-chart for RNA-seq results

 mamba activate plot-numpy1
 (plot-numpy1) jhuang@WS-2290C:/media/jhuang/Elements1/Data_Ute$ python rna_type_piecharts.py /mnt/nvme1n1p1/Homo_sapiens/Ensembl/GRCh38/Annotation/Genes/genes.gtf Data_RNA-Seq_MKL-1+WaGa/merged_gene_counts_MKL-1_human.txt Data_RNA-Seq_MKL-1+WaGa/merged_gene_counts_WaGa_human.txt

Code of rna_type_piecharts.py

 import sys
 import pandas as pd
 import matplotlib.pyplot as plt
 import numpy as np

 def parse_gtf(gtf_file):
     biotype_dict = {}
     with open(gtf_file) as f:
         for line in f:
             if line.startswith('#'):
                 continue
             fields = line.strip().split('\t')
             if fields[2] != 'gene':
                 continue
             attr_field = fields[8]
             attrs = {}
             for attr in attr_field.split(';'):
                 attr = attr.strip()
                 if attr == '':
                     continue
                 key, val = attr.split(' ', 1)
                 attrs[key] = val.strip('"')
             gene_id = attrs.get('gene_id')
             gene_biotype = attrs.get('gene_biotype') or attrs.get('gene_type')  # some GTFs use gene_type
             if gene_id and gene_biotype:
                 biotype_dict[gene_id] = gene_biotype
     return biotype_dict

 def load_counts(mkl_file, waga_file):
     df_mkl = pd.read_csv(mkl_file, sep='\t')
     df_waga = pd.read_csv(waga_file, sep='\t')
     # Check for Geneid/gene_name match
     if not (df_mkl['Geneid'].equals(df_waga['Geneid']) and df_mkl['gene_name'].equals(df_waga['gene_name'])):
         mismatch = df_mkl.loc[
             (df_mkl['Geneid'] != df_waga['Geneid']) | (df_mkl['gene_name'] != df_waga['gene_name']),
             ['Geneid', 'gene_name']
         ]
         print("⚠️ Mismatched rows found between Geneid/gene_name columns:")
         print(mismatch)
         raise ValueError("Mismatch in Geneid/gene_name columns between MKL-1 and WaGa count files.")

     # Drop duplicated Geneid/gene_name from WaGa counts before merge
     df_waga_samples = df_waga.drop(columns=['Geneid', 'gene_name'])

     # Merge horizontally
     df = pd.concat([df_mkl, df_waga_samples], axis=1)
     return df, df_mkl, df_waga

 def map_biotypes(df, biotype_dict):
     # Map gene biotypes by Geneid; if not found, assign 'unknown'
     df['gene_biotype'] = df['Geneid'].map(biotype_dict).fillna('unknown')
     return df

 def save_raw_data_for_pie(df, output_excel):
     # Sum counts by biotype
     # Sum all sample columns (exclude Geneid, gene_name, gene_biotype)
     sample_cols = [c for c in df.columns if c not in ['Geneid', 'gene_name', 'gene_biotype']]
     df['total_counts'] = df[sample_cols].sum(axis=1)

     # Aggregate counts per biotype
     pie_data = df.groupby('gene_biotype')['total_counts'].sum().reset_index()

     with pd.ExcelWriter(output_excel) as writer:
         df.to_excel(writer, sheet_name='All_gene_counts', index=False)
         pie_data.to_excel(writer, sheet_name='Pie_data', index=False)

     return pie_data

 def plot_pie(pie_data, output_png):
     pie_data = pie_data[pie_data['total_counts'] > 0].sort_values('total_counts', ascending=False)

     N = 10  # Top N biotypes
     if len(pie_data) > N:
         top_data = pie_data.nlargest(N, 'total_counts')
         others = pd.DataFrame([{
             'gene_biotype': 'Other',
             'total_counts': pie_data['total_counts'].sum() - top_data['total_counts'].sum()
         }])
         pie_data = pd.concat([top_data, others], ignore_index=True)

     sizes = pie_data['total_counts']
     labels = pie_data['gene_biotype']

     # Use updated color map syntax
     cmap = plt.colormaps['Pastel1']
     colors = [cmap(i % cmap.N) for i in range(len(labels))]

     fig, ax = plt.subplots(figsize=(10, 10))

     # Draw pie chart without labels/autopct
     wedges, _ = ax.pie(
         sizes,
         startangle=90,
         colors=colors,
         radius=1.2
     )

     total = sum(sizes)

     for i, wedge in enumerate(wedges):
         angle = (wedge.theta2 + wedge.theta1) / 2.
         x = np.cos(np.deg2rad(angle))
         y = np.sin(np.deg2rad(angle))

         pct = sizes.iloc[i] / total * 100
         if pct < 1:
             continue  # Skip small slices

         ha = 'left' if x > 0 else 'right'
         ax.annotate(
             f"{labels.iloc[i]} ({pct:.1f}%)",
             xy=(x, y),
             xytext=(1.4 * x, 1.4 * y),
             ha=ha, va='center',
             fontsize=9,
             #arrowprops=dict(arrowstyle='-', color='gray')
             arrowprops=dict(arrowstyle='-', connectionstyle='angle3,angleA=0,angleB=90', color='gray')
         )

     # Build detailed legend: biotype (count, %)
     legend_labels = [
         f"{lbl} ({int(cnt):,}, {cnt / total:.1%})"
         for lbl, cnt in zip(labels, sizes)
     ]

     ax.legend(
         wedges,
         legend_labels,
         title="Gene Biotype",
         loc="center left",
         bbox_to_anchor=(1, 0.5),
         fontsize=9,
         title_fontsize=10
     )

     ax.set_title("", fontsize=14)
     ax.axis('equal')
     plt.tight_layout()
     plt.savefig(output_png, bbox_inches='tight')
     plt.close()

 def main():
     if len(sys.argv) != 4:
         print("Usage: python rna_type_piecharts.py <genes.gtf> <MKL-1_counts.txt> <WaGa_counts.txt>")
         sys.exit(1)

     gtf_file = sys.argv[1]
     mkl_counts_file = sys.argv[2]
     waga_counts_file = sys.argv[3]

     print("🔍 Parsing gene biotypes from GTF...")
     biotype_dict = parse_gtf(gtf_file)

     print("📊 Loading and merging count matrices...")
     df, df_mkl, df_waga = load_counts(mkl_counts_file, waga_counts_file)

     print("🧬 Mapping gene biotypes...")
     df = map_biotypes(df, biotype_dict)
     df_mkl = map_biotypes(df_mkl, biotype_dict)
     df_waga = map_biotypes(df_waga, biotype_dict)

     print("💾 Saving raw data for pie charts to Excel...")
     pie_data = save_raw_data_for_pie(df, 'rna_biotype_pie_data.xlsx')
     pie_mkl_data = save_raw_data_for_pie(df_mkl, 'rna_biotype_pie_data_MKL-1.xlsx')
     pie_waga_data = save_raw_data_for_pie(df_waga, 'rna_biotype_pie_data_WaGa.xlsx')

     print("📈 Plotting pie chart...")
     plot_pie(pie_data, 'rna_biotype_pie_chart.png')
     plot_pie(pie_mkl_data, 'rna_biotype_pie_chart_MKL-1.png')
     plot_pie(pie_waga_data, 'rna_biotype_pie_chart_WaGa.png')

     print("✅ Done! Excel data saved to 'rna_biotype_pie_data*.xlsx' and pie charts saved to 'rna_biotype_pie_chart*.png'.")

 if __name__ == "__main__":
     main()

Reviewers


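Reviewer #1:

In the manuscript entitled "Genomic and phenotypic characterization of a drug-susceptible Acinetobacter baumannii reveals virulence-associated traits and enhanced stress tolerance", Foong et al. systematically analyzed a drug-susceptible A. baumannii isolate using phenotypic and genomic experiments. The study provides important information for exploring novel virulence factors in A. baumannii strains. The manuscript can be considered for acceptance after major revision. The specific issues are as follows:

A comparative genomic analysis of HKAB-1 against other representative A. baumannii strains in genome databases should be performed to reveal their evolutionary relationships.

As shown in Table 1, HKAB-1 is resistant to cephalosporins, penicillins, and several other antibiotics. Whether it is appropriate to define it as a "drug-susceptible A. baumannii" is therefore open to question.

Figure 1 shows that HKAB-1 grows faster than the reference strain. What mechanisms might underlie this? This should be discussed in detail.

The gene clusters in HKAB-1 associated with biofilm formation and swarming motility should be analyzed and their functions discussed.

The virulence of HKAB-1 should be evaluated further with hemolysis assays, protease activity assays, or even a mouse model.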

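Reviewer #2:

This article describes the phenotypic and genomic characteristics of a clinical Acinetobacter baumannii isolate (HKAB-1). Although the strain is susceptible to many antibiotics, it shows unusually strong virulence-associated traits. By comparing HKAB-1 with the reference strain ATCC19606, the study aims to explore the paradoxical relationship between antimicrobial susceptibility and pathogenic potential. Using growth assays, biofilm formation, motility tests, desiccation tolerance assays, and whole-genome sequencing, the authors provide important insights into virulence mechanisms in non-resistant A. baumannii.

In my view the article is well written and the topic is highly relevant. Because of its multidrug resistance, A. baumannii is widely regarded as one of the most important pathogens in hospital-acquired infections. The study is well designed and uses a range of phenotypic and genomic methods. The data are consistent, clearly presented, and supported by appropriate statistical analysis. Before publication, however, the authors should address the following points:

The introduction lacks a clear research hypothesis. The hypothesis is touched on in the abstract and discussion but is never stated explicitly or formally in the introduction. I suggest adding a sentence such as "This study aims to ..." at the end of the introduction to state the objective.
Regarding the hypothesized trade-off between resistance and virulence: the idea is novel and important, but the article provides no direct molecular evidence for it. The phenotypic experiments (enhanced biofilm formation, motility, and desiccation tolerance) are well supported, and the genomic analysis reveals resistance and virulence genes, but the expression or regulation of these genes is not functionally validated (for example by RNA expression analysis or proteomics). I therefore suggest rewording the relevant statements, for example "our findings suggest a potential trade-off" or "the observed phenotype may indicate ...", and stating clearly in the discussion that the molecular basis requires further study.
Genotype does not always translate directly into phenotype. The text repeatedly treats the presence of certain genes (such as adeB) as evidence that a resistance or virulence mechanism is active. Without functional validation (gene expression or protein activity assays), this interpretation may be overstated. More cautious wording such as "potentially active" or "putatively functional" would reflect that gene presence alone does not confirm functional expression.
A major weakness is the lack of clinical information about the patient. Apart from the mention of "left heart failure", no clinical context for the infection is given. This weakens the clinical relevance and translational value of the study. Although the study is retrospective, a brief description of the course of infection, length of hospital stay, treatment, and outcome would greatly increase its clinical significance. If this information is available, I strongly recommend adding it.
There are no RNA-seq or RT-qPCR data to support the conclusions about expression of virulence and resistance genes (such as ade, bap, hemO). Functional validation data would link genotype and phenotype more convincingly. I suggest briefly acknowledging this limitation in the discussion.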

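Reviewer #3:

Review of the manuscript by Bekere et al.

Summary

The manuscript by Bekere et al. examines the role of Yersinia enterocolitica Yop effector proteins in counteracting the immune response of primary human macrophages. The results are of some significance, because more studies on human macrophages are needed for comparison with the large body of existing data from mouse macrophages. In addition, the authors assess the response with several methods, including gene expression, histone phosphorylation, inflammasome activation, and calcium signaling. Although no truly novel results are reported, the study deepens our understanding of the Yersinia-macrophage interaction.

The manuscript has three main weaknesses. First, the authors do not discuss or cite recent data from the Brodsky laboratory on Yersinia infection of human cells. Second, the authors do not mention that the primary human macrophages used were unprimed and therefore do not express pyrin, so that YopM has no role in inflammasome inhibition. Third, the multiplicity of infection (MOI) is not consistent across experiments, and some inflammasome assays use an MOI of 500, which is clearly not physiological.

These issues are described in detail below, together with some minor suggestions to help the authors improve the manuscript.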

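Major comments

Lines 109-111: "All studies on the effect of Yersinia on inflammasome activity have been carried out in mouse macrophages or mouse infection models (Bliska et al., 2013; Schubert et al., 2020)." Please revise the manuscript to include the human-cell data recently published by the Brodsky laboratory (PMID 37615436, PMID 39186805). One of these papers reports that YopP induces apoptosis in primary human macrophages, which is relevant to this study.

Lines 329-358 and Figure 5: the discussion and data on the inflammasome have three major problems:

The recent Brodsky laboratory data mentioned above must be taken into account. Data such as those of Zwack et al. (2015) should also be cited, showing that in murine macrophages the inflammasome activation induced by yopK/yopQ mutants results from hypertranslocation of YopB/D, which triggers caspase-11 activation and the non-canonical NLRP3 pathway.

It should be stated that, because unprimed primary human macrophages were used, pyrin is not expressed and YopM therefore has no inflammasome-inhibiting effect.

The MOI differs between experiments (100 in Figures 1-5, 50 in Figure 7), and some inflammasome experiments in Figure 6 use an MOI of 500 for 2 hours, which is clearly not physiological.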

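Minor comments

Abstract: the phrase "suggesting a higher-level regulatory mechanism" is unclear.

Lines 73-75: "Primary human macrophages closely reflect the function of their in vivo counterparts, and show signs of cell death only 20 hours after Yersinia infection." Please see reference 39186805 and rephrase.

Line 92: "NOD-like receptors (NLRPs)" should be corrected.

Lines 100-101: the citation "Schoberle et al. 2016" is incorrect.

Figure 6C: it is not clear why the data are normalized to pT3SS.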

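Reviewer #4:

In the paper on the coordinated suppression of immune activation in primary human macrophages by Yersinia effector proteins, the authors study the effects of Yersinia enterocolitica T3SS effector proteins on the host. The paper contains RNA-seq data for virulent, avirulent, and mutant strains, and the data are extensive. However, it is not clear how this work differs from the similar RNA-seq data the authors published in 2021.

The following points require a response from the authors:

Is the dataset used here different from the one published by Bekere et al. in 2021? If so, can the authors explain why a new dataset was generated? If the data are the same, can they state what the new findings are?
None of the RNA-seq data are accompanied by bacterial load information. For example, for the 1.5-hour and 6-hour time points in Figure 1, it is not stated whether equal numbers of WAC and WA314 bacteria entered the macrophages. If the bacterial numbers differ, this will affect the host response independently of virulence factors. The same applies to all comparisons of T3SS mutants. Bacterial load information is essential for understanding the role of bacterial virulence factors.
Can the authors provide quantification of the immunoblots in Figure 5A? In addition, Figure 5B lacks statistical analysis, which makes the data hard to interpret given the large error bars.
H3S10ph is both a mitotic marker and a reflection of transcriptional state in interphase. Can the authors provide information on how infection affects the cell cycle? Differences in H3S10ph may arise because the mutants regulate the cell cycle differently. How do the authors distinguish the abundant H3S10ph in mitotic cells from the locus-specific H3S10ph in interphase cells? This matters because mitotic H3S10ph covers the whole genome, whereas in interphase it occurs at specific sites.
In line 324 the authors state: "Suppression of gene transcription by YopP and the cooperative action of the other Yops correlate with suppression of histone H3S10ph, and may suppress gene transcription at a genome-wide level." Have the authors tried overlapping the me3/H3S10 peaks with the differentially expressed genes to test this hypothesis? It is surprising that the datasets were not mined in this way, and that the percentage of upregulated genes overlapping me3/H3S10 peaks, and the corresponding figure for downregulated genes, are not reported.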

Complex human hair-bearing skin organoids as a model for herpes simplex virus 1 (HSV-1) skin infection


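Abstract

For herpes simplex virus 1 (HSV-1), the skin is the main site of initial infection. After the primary lytic infection, the virus enters the peripheral nervous system and establishes latency. Spontaneous reactivation from latently infected neurons causes typical HSV-1-associated disease such as cold sores. Because the skin contains many different cell types and structures, and because responses to infection are human-specific, modelling HSV-1-induced skin pathology is challenging. Nevertheless, studies using monolayer cell lines, epidermal equivalents, ex vivo skin, and mouse models have greatly advanced our understanding of how HSV-1 infects the skin.

However, the role of many skin-specific structures, in particular hair follicles, in initial infection and viral reactivation remains unclear. In this study we use complex human hair-bearing skin organoids derived from induced pluripotent stem cells (iPSCs) as a model of HSV-1 infection. We apply microscopy, bulk transcriptomics, and spatial transcriptomics at single-cell resolution to study the viral life cycle and the host response in specific cell types.

We find that viral infection is mainly restricted to keratinocytes in the epidermis and to specific cell types in the hair follicle. We also observe cell-type-specific activation of interferon-stimulated genes and of the TNF pathway. We further trace paracrine signaling within the tissue and find that TNF-responsive genes are upregulated in neighboring cells.

Taken together, the combination of skin organoids with new spatial transcriptomics technology provides a physiologically highly relevant model system for HSV-1 infection of the skin.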

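Introduction

Herpes simplex virus 1 (HSV-1) is a highly prevalent human pathogen, with a global prevalence of about 67% among people under 50 years of age¹. Primary infection usually occurs through mucosal surfaces or small lesions in the skin. HSV-1 first replicates productively in epithelial cells of the skin and mucosa, causing inflammation and tissue damage that eventually lead to blisters². After replicating in epithelial cells, HSV-1 enters the nerve endings of peripheral neurons and travels retrogradely along the axons to the neuronal cell bodies in the trigeminal ganglion, where it establishes latency³⁻⁴. The virus can reactivate spontaneously without an obvious trigger, re-enter the lytic cycle, and produce infectious particles that travel anterogradely along the axons back to the epithelial cells of the skin and mucosa. The subsequent lytic infection of the epithelium produces the typical herpes lesions and releases virus into the environment².

Various model systems have been used to study primary infection and reactivation of HSV-1 in the skin. Because skin is structurally complex and involves many different cell types, conventional monolayer cell culture can model only some aspects of the viral life cycle. To study viral susceptibility, spread, and cytopathic effects in more complex settings, researchers have used animal models⁵⁻⁸, ex vivo mouse skin⁸⁻¹⁰, human skin explants⁸,¹¹,¹², and epidermal equivalent cultures¹³⁻¹⁴.

Although these models have greatly advanced our understanding of HSV-1 pathogenesis in the skin, many aspects remain unclear. In particular, the susceptibility of specific skin structures such as hair follicles, the role of different fibroblast subpopulations, and the contribution of certain low-abundance cell types to primary infection, establishment of latency, and reactivation have not been fully resolved. The rapidly developing organoid field offers potential new model systems for virus research¹⁵⁻¹⁷.

In this study we use a highly complex human skin organoid (SkO) model, generated from human induced pluripotent stem cells (hiPSCs), to analyze HSV-1 pathogenesis in the skin. The SkOs were generated following the protocol of Lee and colleagues¹⁸⁻¹⁹ and contain a stratified epidermis, a fat-rich dermis, pigmented hair-producing follicles, sebaceous glands, melanocytes, and Merkel cells. The hair follicles are innervated by sensory nerve fibers ensheathed by Schwann cells, and the neuronal cell bodies cluster together with satellite glial cells to form a structure resembling a trigeminal ganglion. The SkO model contains all cell types found in second-trimester human fetal skin, except that it lacks sweat glands, vasculature, and immune cells¹⁸⁻¹⁹. Like many other organoids, SkOs have an inside-out morphology: the dermis is on the outside and exposed to the culture medium, the epidermis is on the inside, and the hairs grow towards the interior of the organoid.

Here we inoculate HSV-1 from the dermal side of the SkO. The virus first infects dermal fibroblasts and then keratinocytes, which mimics the initial infection arising from a deep skin wound or from viral reactivation. Using microscopy, bulk transcriptomics, and spatial transcriptomics at single-cell resolution, we describe how HSV-1 spreads through the tissue and analyze the host immune response in space and time.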

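Results

3.1 Generation of skin organoids

We generated skin organoids (SkOs) following the protocol of Lee et al.¹⁹ and verified their differentiation by brightfield microscopy, immunohistochemistry (IHC), and whole-mount immunofluorescence staining (WMS) (Figure 1 and Supplementary Figure 1). The SkOs showed the expected head-and-tail structure, with the head containing dermis and epidermis and the tail containing cartilage. Our SkOs reached full complexity after about 120 days in culture.

The hair follicles showed the expected structures, such as the dermal papilla, matrix, inner root sheath, and outer root sheath (Supplementary Figure 1). We also detected Merkel cells and innervation of the hair follicles, and observed clustering of neuronal cell bodies in the organoid tail (Figure 1).

3.2 Infection of SkOs with HSV-1

Fully differentiated SkOs (about 120 days old) were infected with HSV-1 strain 17 carrying a GFP gene expressed under the control of the CMV immediate-early promoter. To follow the spread of the virus within the tissue, we infected each SkO with only 800 PFU. Assuming about 4×10³ cells on the organoid surface, this corresponds to an MOI of about 0.2. At 2 days post infection (2 dpi), spots of GFP expression were visible on the organoid surface by fluorescence microscopy (Figure 2A). By 4 dpi the GFP fluorescence had intensified and spread over the whole organoid. IHC on SkO sections and WMS showed that infection first appeared at sites where hair follicles protrude from the organoid surface (Figure 2B and Supplementary Figure 2). At 2 dpi, the first layer of fibroblasts in the dermis expressed GFP, the HSV-1 immediate-early protein ICP0, and the late protein gD. At 4 dpi, infection had spread to internal structures of the organoid, including the epidermis, where keratinocytes of the basal and spinous layers stained positive for GFP, ICP0, and gD. Keratinocytes of the stratum corneum showed no sign of infection. Likewise, all cell layers of the hair follicle except the keratinized hair structures were positive for GFP, ICP0, and gD. Up to 8 dpi, these cornified structures and the cartilage in the organoid tail showed no sign of infection. We also observed a pronounced cytopathic effect in infected cells. Infected dermal fibroblasts showed the first signs at 2 dpi, and these were particularly clear at 4 dpi. Infected keratinocytes in the epidermis and the inner cells of the hair follicle began to show cytopathic effects at 6 dpi, leading to disruption of the basement membrane and of the follicle structure.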

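MOI stands for "multiplicity of infection". It is the average number of infectious virus particles (here counted as PFU) added per target cell. For example:

MOI = 1 means that on average one infectious particle is added per cell.
MOI = 0.2 means that on average one infectious particle is added per five cells. Because particles are distributed randomly among cells, this is not the same as exactly one cell in five becoming infected: under a Poisson model, about 18% of cells (1 - e^(-0.2)) receive at least one particle.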
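The arithmetic behind the MOI quoted in section 3.2 can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the Poisson assumption (particles land on cells independently and at random) is the usual textbook simplification, not something stated in the paper.

```python
import math

def moi(pfu: float, n_cells: float) -> float:
    """Multiplicity of infection: infectious units added per target cell."""
    return pfu / n_cells

def fraction_infected(m: float) -> float:
    """Expected fraction of cells that receive at least one infectious unit,
    assuming the number of units per cell is Poisson-distributed with mean m."""
    return 1.0 - math.exp(-m)

# Values quoted in section 3.2: 800 PFU on roughly 4 x 10^3 surface cells.
m = moi(800, 4e3)
print(f"MOI = {m:.2f}")
print(f"Expected fraction of cells hit = {fraction_infected(m):.3f}")
```

At low MOI the infected fraction is close to the MOI itself; at MOI = 1 it is only about 63%, because some cells receive several particles while others receive none.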

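3.3 Bulk transcriptomics of HSV-1-infected skin organoids

Having confirmed that HSV-1 infects SkOs efficiently, we performed bulk RNA sequencing (bulk RNA-seq) of untreated SkOs and of HSV-1-infected SkOs at 2, 4, 6, and 8 dpi to analyze the host response to infection. At 2 dpi we detected only 20 differentially expressed genes (DEGs). This rose to about 1000 at 4 dpi and to about 5000 at 6 and 8 dpi (Figure 3A). Because there were so few DEGs at 2 dpi, GO enrichment analysis of control versus 2 dpi returned no significant terms. At 4 dpi versus control, GO terms related to immune response (for example leukocyte activation) and inflammation (for example response to tumor necrosis factor) were enriched among upregulated genes (Figure 3B). At 6 and 8 dpi, terms related to cell differentiation (for example skin development) were enriched among downregulated genes, which may reflect the typical HSV-1-induced suppression of host transcription (host cell shutoff)[^33]. Notably, viral RNA already accounted for about 20% of all sequencing reads at 4 dpi and rose slightly to about 25% at 6 and 8 dpi (Figure 3C).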
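The DEG counts above depend on the significance thresholds, which the Figure 3 legend gives as padj ≤ 0.05 and log2FC ≥ 1 or ≤ -1. The sketch below shows how such a filter could be applied to a results table. The `GeneResult` record, the function name, and the toy gene names are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Iterable, List, NamedTuple, Tuple

class GeneResult(NamedTuple):
    gene: str
    log2fc: float   # log2 fold change, infected vs. uninfected
    padj: float     # multiple-testing-adjusted p-value

def split_degs(results: Iterable[GeneResult],
               padj_max: float = 0.05,
               lfc_min: float = 1.0) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) gene names passing both cutoffs."""
    up, down = [], []
    for r in results:
        if not r.padj <= padj_max:      # also drops NaN padj values
            continue
        if r.log2fc >= lfc_min:
            up.append(r.gene)
        elif r.log2fc <= -lfc_min:
            down.append(r.gene)
    return up, down

# Toy input (made-up values, for illustration only).
toy = [
    GeneResult("GENE_A", 2.3, 0.001),    # significant, up
    GeneResult("GENE_B", -1.5, 0.01),    # significant, down
    GeneResult("GENE_C", 3.0, 0.2),      # large change, not significant
    GeneResult("GENE_D", 0.4, 0.0001),   # significant, change too small
]
up, down = split_degs(toy)
print(up, down)
```

Genes with a missing (NaN) adjusted p-value fail the `padj <= padj_max` comparison and are excluded, which matches how tools that report NaN for filtered genes are usually handled.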

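3.4 Spatial transcriptomics of HSV-1-infected skin organoids

To understand the temporal and spatial dynamics of HSV-1 infection and of the host response in the SkO model, we used spatial transcriptomics. We ran two spatial transcriptomics experiments at single-cell resolution on the Xenium platform. The pilot experiment analyzed sections of uninfected SkOs and of SkOs at 2 and 6 dpi. The main experiment instead analyzed samples at 2, 3, and 4 dpi (Figure 4).

We measured the expression of about 500 genes, including five viral transcripts (HSV-1 US1, UL27, UL29, UL54, and LAT). Using the non-integrated transcriptomic data, we assigned a cell type identity to each cell (Figure 4A and Supplementary Figure 3A-D). We identified all expected cell types, including low-abundance types such as Merkel cells, consistent with the single-cell sequencing and immunostaining results of Lee et al.

The transcriptome-based assignment also distinguished subtypes with specific spatial locations, such as different fibroblasts located near to or far from the epidermis (papillary and mesenchymal). The keratinocyte subtypes of the epidermis (basal, spinous, granular, and cornified layers) could also be distinguished and had clear spatial locations. Hair follicle cell types (dermal sheath, outer root sheath, inner root sheath, matrix, bulge) were identified as well and matched their expected spatial distribution.

Cell type identification was equally successful in infected SkOs (Figure 4B). In some heavily infected cells, however, especially fibroblasts, the transcriptional signature was lost and the cells could not be classified. These were labelled "undefined" (gray cells in the upper middle region of Figure 4B, and Supplementary Figure 3C). For example, mRNA expression of the fibroblast marker PDGFRα was completely lost in highly infected cells, indicating that host transcription had been shut off by the virus (compare the two upper panels of Figures 4C and 4D)[^33].

As infection progressed (4 dpi), the proportion of undefined cells increased but overall did not exceed about 15% of all cells (Figure 4E).

We then analyzed the expression of the five HSV-1 transcripts (US1/ICP22, UL54/ICP27, UL29/ICP8, UL27/gB, and LAT) in the SkOs. Infection started in fibroblasts of the outer dermis (2 dpi), spread progressively into the organoid, and reached the epidermis at 4 dpi (Figure 4D and Supplementary Figure 3B), in agreement with the protein staining.

To study differences in susceptibility to HSV-1 and in viral gene expression between cell types in the SkO, we compared the proportion of infected cells among the main cell types of the skin, fibroblasts and keratinocytes. We focused on the most abundant fibroblast subtypes (papillary 1 and 2) and on the basal and spinous keratinocyte subtypes.

When we counted the proportion of cells of each type containing at least 2 viral transcripts, papillary 1 fibroblasts showed a higher infected fraction early in infection, probably because they lie on the outside of the organoid and are more exposed to the virus (Figure 4F, top). By 4 dpi, papillary 1 and 2 fibroblasts showed similar infected fractions, whereas keratinocytes still had a lower infected fraction. In addition, the viral load in keratinocytes was always markedly lower than in fibroblasts (Figure 4F, bottom).

Because we measured only five viral transcripts, we could not assess the complete viral gene expression cycle. We therefore focused on the expression of the early gene UL54 and the late LAT. When cells were ordered by viral load, papillary fibroblasts showed the expected pattern: UL54 rose rapidly at first and then plateaued or declined, whereas LAT expression increased continuously (Figure 4G). In basal keratinocytes, however, viral replication was clearly slower and showed signs of a lag phase.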
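Ordering cells by viral load and summarizing them in fixed-size groups, as described for Figure 4G (groups of 20 cells, restricted to cells with at least 2 viral transcripts), could look roughly like the sketch below. The cell record layout and the function name are hypothetical. "Viral load" is taken here as viral counts divided by total counts per cell, and the per-group value as the gene's share of all transcripts in the group; both are assumptions about how the figure was computed.

```python
from typing import Dict, List

def binned_gene_percent(cells: List[Dict], gene: str,
                        bin_size: int = 20, min_viral: int = 2) -> List[float]:
    """Sort virus-positive cells by viral load and return, for each consecutive
    group of `bin_size` cells, the percentage of the group's transcripts that
    come from `gene`. The last group may contain fewer cells."""
    infected = [c for c in cells if c["viral"] >= min_viral]
    infected.sort(key=lambda c: c["viral"] / c["total"])
    percents = []
    for start in range(0, len(infected), bin_size):
        group = infected[start:start + bin_size]
        total = sum(c["total"] for c in group)
        gene_counts = sum(c["counts"].get(gene, 0) for c in group)
        percents.append(100.0 * gene_counts / total)
    return percents

# Toy data: 20 lightly infected and 20 heavily infected cells.
low = [{"viral": 2, "total": 100, "counts": {"UL54": 1}} for _ in range(20)]
high = [{"viral": 10, "total": 100, "counts": {"UL54": 5}} for _ in range(20)]
print(binned_gene_percent(high + low, "UL54"))
```

Each cell record is assumed to carry `total` (all transcripts, viral included), `viral` (viral transcripts only), and a per-gene `counts` mapping.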

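It should be noted that basal keratinocytes lie deeper in the tissue, whereas papillary fibroblasts (type 1 in particular) lie on the outside of the organoid. Because the virus infects the organoid from the outer surface, the lower viral load in keratinocytes could simply reflect later infection.

To test this, we analyzed how the distance of a cell from the organoid surface relates to the proportion of infected cells and to viral load (Supplementary Figure S3E+F). We grouped basal keratinocytes and papillary 1 and 2 fibroblasts by their distance from the organoid border and counted, for each group, the proportion of cells containing at least 2 viral transcripts. Even close to the surface, the proportion of virus-positive keratinocytes was lower than for either fibroblast type.

In deeper regions, the virus-positive fraction of papillary 1 fibroblasts was actually lower than that of type 2, possibly because type 1 cells were infected earlier and had begun to show cytopathic effects. Interestingly, at every distance the viral load of virus-positive basal keratinocytes was lower than that of fibroblasts at the same distance.

These results indicate that basal keratinocytes are less susceptible to the virus and support less replication than the main fibroblast subtypes of the dermis. This difference is not determined by spatial position alone and may instead be due to the cell type itself or to constraints imposed by the surrounding extracellular structures.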
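The distance-controlled comparison described above amounts to grouping cells by cell type and distance bin and computing the virus-positive fraction in each group. The following is a minimal sketch under stated assumptions: the text does not give the bin width, so the 50 µm default is a placeholder, and the input format and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def positive_fraction_by_distance(
    cells: Iterable[Tuple[str, float, int]],
    bin_width: float = 50.0,   # placeholder value, not from the paper
    min_viral: int = 2,
) -> Dict[Tuple[str, int], float]:
    """cells: (cell_type, distance_to_border_um, viral_transcript_count).
    Returns {(cell_type, bin_index): fraction of cells with >= min_viral}."""
    tally = defaultdict(lambda: [0, 0])   # key -> [positive, total]
    for cell_type, distance, viral in cells:
        key = (cell_type, int(distance // bin_width))
        tally[key][1] += 1
        if viral >= min_viral:
            tally[key][0] += 1
    return {key: pos / tot for key, (pos, tot) in sorted(tally.items())}

# Toy data: fibroblasts and keratinocytes at similar depths.
toy_cells = [
    ("fibroblast", 10, 5), ("fibroblast", 20, 0), ("fibroblast", 60, 2),
    ("keratinocyte", 10, 0), ("keratinocyte", 30, 3),
    ("keratinocyte", 40, 0), ("keratinocyte", 40, 0),
]
for key, frac in positive_fraction_by_distance(toy_cells).items():
    print(key, round(frac, 2))
```

Comparing cell types within the same bin index is what separates a depth effect from a cell-type effect.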

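3.5 Hair follicles slow viral spread

The skin organoids (SkOs) used in this study contain several structures, including hair follicles (HF) made up of cell types of different origins. In the dermis, a specific fibroblast subtype, the dermal sheath cells, surrounds the outer root sheath keratinocytes and the bulge. At the base of the follicle there is another type of fibroblast, the dermal papilla cells. The keratinocyte layers inside the follicle include the inner root sheath and matrix cells, and melanocytes may also be embedded in these layers.

When we analyzed the viral load of each cell type in the hair follicle, it was markedly lower than in the surrounding papillary fibroblasts, which were rich in viral RNA (Figure 5A). Even when the surrounding papillary fibroblasts had a very high viral load, viral RNA levels inside the follicle remained low (Figure 5B). We tested whether this was simply because follicle cells lie deeper in the tissue, but the results ruled this out (Supplementary Figure 4B, C).

We propose three possible explanations:

There may be a ring of cells around the hair follicle that is not susceptible to the virus;
The hair follicle may be protected by an extracellular matrix barrier;
The viral infection cycle in the cell types of the hair follicle may be much slower than in surrounding cells such as papillary fibroblasts.

To test the first possibility, we identified at 4 dpi the dermal sheath fibroblasts that surround the follicle and lie next to highly infected papillary fibroblasts (Figure 5D, Supplementary Figure S4A). The viral load of these dermal sheath cells was clearly lower than that of the cells around them (Figure 5C), and the viral load of the other cell types inside the follicle was lower still.

Because our main spatial transcriptomics experiment only extended to 4 dpi, we went back to the immunostaining data to check whether the follicles eventually became fully infected at later time points. Indeed, at later time points the hair follicles also showed signs of infection (Figure 5F).

Taken together, these results indicate that the hair follicle as a whole is not completely resistant to the virus, but that its infection cycle is clearly slower.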

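3.6 HSV-1 induces layered expression of TNF family cytokines

We next examined the host transcriptome changes triggered by HSV-1 infection. HSV-1 infection not only causes a broad shutoff of host transcription[^33], read-through transcription[^34], and antisense transcription[^35], but also induces the expression of a number of host genes[^36-39].

To identify host genes that respond to infection, we compared expression between infected and uninfected samples within each cell type while taking the level of viral RNA into account. Cells were divided into three classes: no viral RNA (fewer than 2 HSV-1 transcripts), low viral RNA (viral load below the median for that cell type and time point), and high viral RNA (above the median). We focused on two interferon-stimulated genes, IRF9 and MX1, which are known to be induced by infection with DNA viruses[^40] and RNA viruses[^41].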
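The three-way classification described above can be written down compactly. This is a minimal sketch, not the authors' code. The record layout and function name are hypothetical; the median is taken over virus-positive cells only, as the Figure 6 legend states; and cells exactly at the median are assigned to the high class, which is an arbitrary tie-breaking choice made here.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List

def classify_viral_load(cells: List[Dict], min_viral: int = 2) -> List[str]:
    """Label each cell '-', '+', or '++' (none / below median / at or above
    median), with the median computed per (cell_type, timepoint) over
    virus-positive cells only."""
    loads = defaultdict(list)
    for c in cells:
        if c["viral"] >= min_viral:
            loads[(c["cell_type"], c["timepoint"])].append(c["viral"] / c["total"])
    medians = {key: median(values) for key, values in loads.items()}

    labels = []
    for c in cells:
        if c["viral"] < min_viral:
            labels.append("-")
        elif c["viral"] / c["total"] < medians[(c["cell_type"], c["timepoint"])]:
            labels.append("+")
        else:
            labels.append("++")
    return labels

# Toy data: five papillary fibroblasts at 3 dpi with 100 transcripts each.
toy = [{"cell_type": "papillary_1", "timepoint": "3dpi", "viral": v, "total": 100}
       for v in (0, 1, 2, 10, 50)]
print(classify_viral_load(toy))
```

Computing the median separately for each cell type and time point keeps the "low" and "high" classes comparable across cell types whose overall viral loads differ widely.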

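IRF9 mRNA was already expressed at fairly high levels in uninfected organoids and, after infection, was further upregulated mainly in spinous keratinocytes (Figure 6A, top). MX1, in contrast, was upregulated mainly in papillary fibroblasts (Figure 6A, bottom). At 3 dpi, MX1 was clearly upregulated in both virus-free and high-virus cells of subtype 1. At 4 dpi, MX1 was induced in high-virus cells of subtype 2. This suggests that in some cells the upregulation of IRF9 and MX1 may be an indirect effect, for example mediated by paracrine signaling.

In our earlier bulk RNA-seq experiment, genes related to the TNF response were significantly upregulated at 4 dpi (Figure 3B). In the spatial transcriptomics experiment we likewise found that the TNF gene, encoding TNFα, and the TNFSF9 gene, encoding 4-1BBL, were significantly upregulated during HSV-1 infection. Notably, their expression was strongly cell-type dependent and almost mutually exclusive: TNF was expressed mainly in keratinocytes (especially in the basal layer), whereas TNFSF9 was expressed in papillary fibroblasts (Figure 6B, E, F). Unlike IRF9 and MX1, expression of TNF and TNFSF9 clearly depended on the presence of virus. It increased with viral load, peaked, and then declined, which may be related to virus-induced host shutoff (Figure 6C, D).

We confirmed that these signals did not arise from "read-in" transcription caused by virus-induced failure of transcription termination[^34]. Neither the bulk RNA-seq data from this study nor published data from HSV-1-infected fibroblasts[^34] and HaCat keratinocytes[^42] showed that the upregulation of TNF and TNFSF9 came from read-in. On the contrary, these data confirmed that TNFSF9 is upregulated in fibroblasts and TNF in keratinocytes. In the SkO model we also observed gene activation caused by antisense transcription (for example BBC3) or read-through transcription (for example DUX4, CD79A, PECAM1) (Supplementary Figure S5), further confirming that HSV-1 infection triggers these aberrant transcription events.

In summary, HSV-1 infection of skin organoids induced a localized, layered expression pattern of TNF family members. The spatial distribution of the different cell types, combined with the specificity of their transcriptional responses, may produce a highly coordinated pattern of cytokine expression in response to infection.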

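3.7 TNF expression induces NF-κB target genes in surrounding cells

We next asked whether TNF expression triggers a functional response in surrounding cells. First, we defined "neighboring cells" as those close to TNF-expressing cells (Figure 7A). Then, based on a published RNA-seq dataset[^43], we selected fibroblast genes likely to be induced by TNFα. Eight of these genes were included in the panel of our spatial transcriptomics experiment (Figure 7B).

Comparing gene expression in fibroblasts close to TNF-expressing cells with those further away, we found that these genes were generally upregulated in neighboring cells, although the upregulation was not statistically significant. We then focused on CXCL2, a typical TNFα-induced gene, and found many CXCL2-expressing cells in dermal regions adjacent to TNF-expressing keratinocytes (Figure 7C).

To rule out the possibility that this was induced by 4-1BBL (that is, TNFSF9), we also examined regions with high TNFSF9 expression but little or no TNF expression. In these regions, CXCL2 expression was clearly lower (Figure 7D).

In summary, our results indicate that cell-type-specific expression of the TNF gene leads to secretion of the cytokine TNFα, which in turn induces a set of downstream target genes in neighboring cells. This process may be mediated by the NF-κB signaling pathway and reflects the spatial coordination of the local immune response triggered by infection.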
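The neighborhood analysis in this section rests on labelling each cell as TNF-expressing, neighboring, or other, as in Figure 7A. One simple way to do this is a distance cutoff around every TNF-expressing cell, as sketched below. The text does not state how "neighboring" was defined, so the radius, the "at least one TNF count" criterion, the record layout, and the function name are all assumptions made for illustration.

```python
import math
from typing import Dict, List

def label_neighbors(cells: List[Dict], gene: str = "TNF",
                    radius: float = 30.0) -> List[str]:
    """Label each cell 'expressing' (>= 1 count of `gene`), 'neighbor'
    (within `radius` of any expressing cell), or 'other'.
    `radius` is a placeholder in the same units as the x/y coordinates."""
    sources = [(c["x"], c["y"]) for c in cells if c["counts"].get(gene, 0) > 0]
    labels = []
    for c in cells:
        if c["counts"].get(gene, 0) > 0:
            labels.append("expressing")
        elif any(math.dist((c["x"], c["y"]), s) <= radius for s in sources):
            labels.append("neighbor")
        else:
            labels.append("other")
    return labels

# Toy data: one TNF-positive cell, one nearby cell, one distant cell.
toy = [
    {"x": 0.0, "y": 0.0, "counts": {"TNF": 3}},
    {"x": 10.0, "y": 0.0, "counts": {"CXCL2": 4}},
    {"x": 100.0, "y": 0.0, "counts": {}},
]
print(label_neighbors(toy))
```

The brute-force distance check is quadratic in the number of cells. It is fine for a sketch, but a dataset with hundreds of thousands of cells would normally use a spatial index instead.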

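Discussion

In this study we show for the first time that HSV-1 can infect complex human hair-bearing skin organoids (SkOs). Because these organoids have an inside-out structure, infection starts from the dermis. The model therefore most closely resembles HSV-1 reactivation, in which virus particles are released from nerve cells onto fibroblasts in the dermis[^44]. This makes our model highly relevant for studying the pathology caused by HSV-1 reactivation, since most HSV-1-associated disease results from reactivation rather than primary infection. For example, recurrent herpes labialis has an average annual incidence of about 1.6 cases per 1000 people[^45]. Skin organoids could also be used to explore how lytic infection is suppressed upon reactivation[^46].

When analyzing the susceptibility of different cells in the SkO model, we found that almost all cell types were eventually infected by 10 dpi, except for the stratum corneum, the keratinized regions of the hair follicle, and the cartilage in the organoid tail. This is to be expected in the absence of immune cells that could clear the infection. The resistance of the stratum corneum to HSV-1 has been confirmed in several mouse and ex vivo skin studies: the cornified cells effectively prevent infection from the outside[^6,^9]. Ex vivo skin studies also show, however, that damaging the skin from the outside, for example with microneedles, results in only minor infection of the epidermis[^11,^12]. Only when the virus enters from the dermal side, through removal of the dermis or deep wounding, are basal keratinocytes infected efficiently, with subsequent spread to the more differentiated keratinocyte layers[^9,^47]. This further underlines the protective role of the epidermis. We do not yet know why chondrocytes resist infection. Because we did not observe expression of the GFP reporter encoded in the viral genome, the virus may be blocked at the entry stage. Future experiments will test whether this is due to a lack of viral receptors or whether the extracellular matrix produced by chondrocytes forms a physical barrier.

To gain deeper insight into viral spread and the host response in the skin model, we performed spatial transcriptomics. We clearly identified all expected cell types and could therefore analyze viral load and host response by cell type and spatial position together. This makes spatial transcriptomics an ideal method for studying viral infection in complex models. We identified several fibroblast and keratinocyte subpopulations that differed markedly in viral load and host response. Fibroblasts of the papillary dermis had a much higher viral load than basal and spinous keratinocytes, suggesting that they produce more infectious virus particles and so drive the rapid spread of HSV-1 through the dermis. In the dermal sheath and dermal papilla fibroblasts of the hair follicle, viral load was markedly lower than in the adjacent papillary fibroblasts. Ex vivo studies indicated that dermal fibroblasts are slightly less susceptible than keratinocytes[^8,^11], but these studies did not distinguish fibroblast subtypes. Our model lacks reticular fibroblasts, the main fibroblast subtype of the adult dermis, which may explain the difference. In all hair follicle cells we observed a clear delay in infection. Since the follicles were eventually infected at later time points, we rule out that these cells lack viral receptors. We speculate that the delay is due to restricted viral gene expression in specific cell types, particularly dermal sheath and dermal papilla cells. In addition, the keratinocyte subpopulations of the hair follicle may have some resistance to infection or, like basal and spinous keratinocytes, restrict viral gene expression. As high-resolution techniques such as spatial transcriptomics develop, they should reveal more about the dynamics of viral spread in the skin.

Based on our bulk RNA-seq data, we focused on the activation of immune and inflammatory pathways, in particular the TNF pathway, because it was significantly upregulated at 4 dpi. We detected upregulation of two TNF family cytokines: TNF (encoding TNFα) was expressed mainly in basal keratinocytes, whereas TNFSF9 (encoding 4-1BBL) was expressed in papillary fibroblasts. We also observed induction of TNF target genes in surrounding cells and priming of uninfected neighboring cells, indicating that the TNF signaling pathway is functionally active in the organoids. Earlier studies have also linked TNF expression to HSV-1 infection and reactivation in the brain[^48,^49]. Our results suggest that cell-type-specific expression of TNF family factors, together with the layered structure of the skin, may initiate a spatially coordinated immune response that rapidly suppresses viral reactivation. In future work we will study the effect of TNF on viral gene expression and its potential protective effect on uninfected cells.

In conclusion, we show that SkOs are a highly physiologically relevant model system for studying HSV-1 infection of the skin. Combined with emerging technologies such as single-cell spatial transcriptomics, they allowed us to analyze viral gene expression and the host response in unprecedented depth, which will open new avenues for research on viral infections of the skin.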

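Legends of the seven figures:

Figure 1: Characterization of SkO differentiation. A. Brightfield images of SkO differentiation from day -1 (d-1) to day 120 (d120). Scale bar: 500µm. B. Maximum-intensity projection of a Z-stack of whole-mount immunofluorescence (whole mount IF) staining of a 120-day SkO. Scale bar: 500µm. C. Magnified views. Scale bar: 50µm. Markers: KRT17 (epidermis, outer root sheath of the hair follicle), PDGFRα (dermal fibroblasts, dermal papilla), TUBB3 (neurons, Schwann cells), Hoechst (nuclear stain), KRT20 (Merkel cells), KRT71 (inner root sheath of the hair follicle), KRT15 (basal layer of the epidermis, outer root sheath of the hair follicle, medulla), LHX2 (bulge, hair matrix, hair germ, hair placode), SOX2 (dermal condensate, dermal papilla, melanocytes, Merkel cells), S100β (Schwann cells, satellite glial cells), NEFH (sensory neurons, neurons with large cell bodies).

Figure 2: HSV-1 infection of SkOs causes cytopathic effects. A. Brightfield (BF) and fluorescence (GFP) images of 120-day SkOs, uninfected and at 2, 4, 6, and 8 days post infection (dpi) with HSV-1. HSV-1 expresses GFP driven by the CMV promoter. Scale bar: 500µm. B. Immunohistochemistry (IHC) staining of HSV-1-infected SkOs. Arrows in the overview images mark the regions shown at higher magnification below. Scale bars: overview 500µm, magnification 20µm. Magnified views: the upper panels show epidermis on the left, dermis on the right, and the basal layer in between; the lower panels show a hair follicle region. Stains: H&E (hematoxylin and eosin), GFP (GFP expressed by HSV-1), gD (antibody staining of HSV-1 glycoprotein D).

Figure 3: Bulk transcriptomics of HSV-1-infected SkOs. A. Volcano plots showing differential expression of host genes at 2, 4, 6, and 8 dpi compared with uninfected SkOs. B. GO analysis with WebGestalt of significantly differentially expressed genes from RNA-seq (padj ≤ 0.05, log2FC ≥1/≤-1), showing the 10 significantly enriched GO terms (FDR ≤ 0.05) with the highest enrichment ratio, grouped by biological function. C. Percentage of reads mapping to HSV-1 (uninfected, 2, 4, 6, 8 dpi); each bar represents one replicate.

Figure 4: Spatial transcriptomics of HSV-1-infected skin organoids. A-B. Part of an uninfected organoid (A) and of an organoid at 3 dpi (B), with cells colored by cell type. C-D. Same regions as in A-B, colored by the expression level of the indicated genes. E. Relative abundance of selected cell types at each time point. Thin lines show the standard deviation across the three organoids per time point; colors as in A-B. F. Top: percentage of cells with at least 2 detected HSV-1 transcripts for each selected cell type, showing three replicates and standard deviation. Bottom: log10-transformed distribution of viral load (as percentage of viral transcripts). In the box plots the line marks the median, the box the interquartile range, the whiskers 1.5 times the IQR, and points the outliers. G. Cells with at least 2 viral transcripts were ordered by viral load and grouped in sets of 20; shown is the percentage of two viral genes (transcripts) in each group for papillary 1 fibroblasts (left) and basal keratinocytes (right).

Figure 5: Hair follicles show greater resistance to viral infection. A. Viral load of the hair follicle cell types compared with papillary 1/2 fibroblasts. B. Expression of HSV-1 UL54 around a hair follicle. Left: part of an uninfected organoid; right: sample at 4 dpi. Top row: colored by cell type, with hair follicle cell types labelled; bottom row: colored by normalized HSV-1 UL54 expression. The orange outline marks dermal sheath cells close to highly infected papillary fibroblasts. C. 真皮鞘细胞通过其邻近（50μm 细胞，右边为毛囊内 >50μm 细胞。 D. DAPI, cell surface protein, and internal protein staining on the Xenium slide; the dermal sheath cells from B are outlined in orange and blue. F. Immunohistochemistry with an anti-HSV-1 gD antibody on sections of infected organoids at each dpi time point.

Figure 6: Cell-type-specific induction of cytokines. A-B. Expression of the indicated genes in selected cell types. For uninfected and 2dpi samples all cells are pooled; 3dpi and 4dpi samples are split into three groups: no or only 1 viral count (-), viral load below the median (+), and above the median (++). The median is calculated over cells with >1 viral count. Dot size shows the proportion of cells with at least one count of the gene, and color shows mean expression. Black circles mark a significant difference from uninfected cells in the pseudobulk analysis. C-D. Cells with at least 2 viral counts were ordered by viral load and grouped in sets of 20; shown is the percentage of TNF, TNFSF9, and HSV-1 UL54 in each group for stratum basale keratinocytes (C) and papillary 1 fibroblasts (D). E-F. Part of a SkO at 4 dpi, colored by cell type (E) and by the expression level of the indicated genes (F).

Figure 7: TNF induction leads to target gene expression in neighboring regions. A. SkO at 4 dpi (upper left of Figure S3). Cells are colored as: TNF-expressing cell population (blue), cells neighboring them (red), other (green). B. Differential expression analysis of TNF target genes in fibroblasts near to versus far from TNF-expressing cells. C. The region from Figure 6F showing CXCL2 expression (TNF-expressing cells outlined in red). D. A similar region from another organoid at 4 dpi that contains TNFSF9-expressing cells but shows no clear increase in CXCL2 expression.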

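Ex vivo skin model vs SkO model (skin organoid model)

🔬 1. Origin and composition

Ex vivo skin model

Derived from real human or animal skin tissue (such as surgically removed tissue or cadaver samples)
Retains the complete epidermal and dermal structure, sometimes including appendages such as blood vessels and sweat glands
Contains mature cell types such as keratinocytes and fibroblasts

SkO model (skin organoid model)

Derived from human stem cells (such as hESC or hiPSC) by in vitro differentiation
Forms structures resembling those of development, including hair follicles, dermis, and epidermis
Contains multiple skin-related cell subtypes, but is generated in vitro and is not equivalent to native skin

🧪 2. Experimental characteristics

Ex vivo skin model

Explanted tissue that cannot be cultured long term; viability declines rapidly
Sometimes contains some immune cells, but with limited function
Comes from different individuals, so sample-to-sample variation is large, which hampers standardization

SkO model

Can be cultured long term, mimics development, and approximates living tissue
Contains no native immune cells, although they could be added exogenously
Derived from the same cell source, so reproducibility is good and it suits well-controlled, systematic studies

🦠 3. Use in HSV-1 infection research

Ex vivo skin model

Suited to studying primary infection, in which the virus has to enter through the epidermis
Infection efficiency usually depends on whether the epidermis is wounded (for example with microneedles)
Differences in infection between cell subtypes are hard to resolve

SkO model

Closer to HSV-1 reactivation, mimicking entry of the virus from the dermis
The structure allows infection of deeper cells, with higher efficiency
Can be combined with spatial transcriptomics to identify the susceptibility and response of different cell subtypes

🧬 4. Scalability and applications

Ex vivo skin model

Limited supply, so high-throughput screening is difficult
Hard to combine with other systems (such as the nervous system)

SkO model

Easy to standardize and scale up, suitable for drug screening and gene function studies
Could potentially be combined with brain organoids to build a "neuro-skin axis" model for studying viral reactivation

✅ Summary

The ex vivo skin model is closer to real skin tissue, but the experimental window is short and it is hard to control
The SkO model is a new, scalable platform for studying the skin as a system, particularly suited to systematic studies of viral spread and host response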

A brief overview of STR analysis in paternity testing


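Paternity testing commonly uses short tandem repeats (STRs) for individual identification and confirmation of genetic relationships.
STRs are DNA regions in which a short motif is repeated a variable number of times. They are highly polymorphic and are widely distributed in non-coding regions (such as introns).

A paternity test typically examines several commonly used STR loci (such as VWA, D3S1358, and FGA). Each locus has multiple alleles that differ in the number of repeats.

🧬 The VWA locus and STR markers

VWA actually refers to an STR locus within the VWF (von Willebrand factor) gene.
This STR locus lies in intron 40 of the VWF gene.
Because the region is non-coding, variation in repeat number does not affect protein function, which makes it suitable as a genetic marker.

📌 Advantages of STRs in paternity testing

High polymorphism: they distinguish clearly between individuals.
Stable inheritance: they are passed from parents to offspring following Mendelian rules.
Non-coding location: they do not affect health or gene expression, which makes them easier to accept ethically.
A test usually covers 15 to 20 STR loci, and parentage is judged by comparing the alleles of the child with those of the parents.

Paternity testing relies on the diversity and stability of STRs. By checking whether the alleles at each STR locus match, it determines whether a biological parent-child relationship exists.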
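The allele comparison described above follows from Mendelian inheritance: at every locus, one of the child's two alleles must come from the mother and the other from the father. The sketch below checks that rule for a mother, child, and alleged father. It is a simplified illustration with made-up allele values and hypothetical function and variable names. It ignores mutation and does not compute a paternity index or any likelihood-based statistic.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, Tuple[int, int]]   # locus -> two allele repeat counts

def inconsistent_loci(child: Profile, mother: Profile,
                      alleged_father: Profile) -> List[str]:
    """Return the loci at which the child's genotype cannot be explained by
    one allele from the mother plus one allele from the alleged father."""
    failures = []
    for locus, (a1, a2) in child.items():
        maternal = set(mother[locus])
        paternal = set(alleged_father[locus])
        explained = (a1 in maternal and a2 in paternal) or \
                    (a2 in maternal and a1 in paternal)
        if not explained:
            failures.append(locus)
    return failures

# Toy profiles at three loci (repeat numbers invented for illustration).
child = {"VWA": (16, 18), "D3S1358": (15, 17), "FGA": (21, 24)}
mother = {"VWA": (16, 17), "D3S1358": (15, 15), "FGA": (21, 22)}
man_a = {"VWA": (18, 19), "D3S1358": (17, 18), "FGA": (24, 25)}
man_b = {"VWA": (14, 15), "D3S1358": (17, 18), "FGA": (20, 23)}

print(inconsistent_loci(child, mother, man_a))   # no inconsistencies
print(inconsistent_loci(child, mother, man_b))   # mismatches at VWA and FGA
```

Because STRs occasionally mutate between generations, a mismatch at a single locus is generally not treated as proof of exclusion on its own. A check like this only flags loci for closer evaluation.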

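During recombination, segments of the two parental DNA copies are indeed exchanged and rejoined, but the junctions are not restricted to the inside of genes. They can also fall in non-coding regions outside genes. Specifically:

Recombination (in particular homologous recombination during meiosis) is a form of DNA double-strand break repair and takes place between homologous regions of chromosomes.
The break-and-rejoin junctions can lie anywhere: within genes (in exons or introns) or in intergenic (non-coding) regions.
Because most of the genome is non-coding (introns, regulatory regions, spacer regions, and so on), most crossovers fall in these regions, so coding sequences are rarely disrupted.
Recombination hotspots are unevenly distributed. They are often near genes or regulatory regions, but their exact positions are highly variable.
The STR markers used in paternity testing are mostly located in non-coding regions (introns or intergenic regions), in part because these regions are under little functional selection pressure and readily accumulate polymorphism.

Summary:
The junctions are not necessarily in the coding part of a gene. Recombination can occur within genes as well as in the non-coding regions between genes, depending on the recombination mechanism and the location of hotspots.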

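Two brothers draw on the same parental gene pool, yet they look different. The main reasons are:

Genetic recombination
During the meiosis that produces gametes (sperm or eggs), the parental chromosomes recombine, so the combination of genes carried by each gamete is unique. Two brothers arise from different gametes, so their gene combinations differ.
Independent assortment
Chromosomes are distributed at random into gametes during meiosis, so the set of chromosomes each gamete receives is random. This adds to the genetic differences between brothers.
Mutation
Although rare, new mutations can arise during gamete formation or early embryonic development and create differences between individuals.
Environmental factors and epigenetics
Beyond genetics, the environment in which a person grows up, nutrition, and lifestyle also affect appearance and character, so brothers can show different traits.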
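The scale of independent assortment alone can be estimated with a short calculation. Humans have 23 chromosome pairs, and each gamete receives one chromosome from each pair at random. The sketch below counts the resulting combinations while deliberately ignoring recombination and mutation, both of which increase the diversity further.

```python
N_PAIRS = 23                      # human chromosome pairs

per_gamete = 2 ** N_PAIRS         # chromosome combinations in one parent's gametes
per_couple = per_gamete ** 2      # combinations for one sperm plus one egg

print(f"Combinations per gamete: {per_gamete:,}")
print(f"Combinations per couple: {per_couple:,}")
print(f"Chance that two siblings inherit identical chromosome sets: 1 in {per_couple:,}")
```

Even without crossovers, one couple can produce about 7 × 10^13 distinct chromosome combinations, which is why siblings other than identical twins practically never share the same genotype.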

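In summary, although two brothers come from the same gene pool, recombination, independent assortment, mutation, and environmental influences give them different gene combinations and different patterns of expression, so they differ in appearance and in other traits.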