German Residence Act (AufenthG) §9 vs §18: translation and comparison of differences
§ 9 (Settlement permit / Niederlassungserlaubnis): translation
§9 Settlement permit (Niederlassungserlaubnis)
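(1) The settlement permit is a residence title of unlimited duration. Ancillary conditions may be attached only in cases expressly permitted by this Act. §47 remains unaffected. (sozialgesetzbuch-sgb.de)
(2) A foreigner shall be issued a settlement permit if: (sozialgesetzbuch-sgb.de)
1. the foreigner has held a residence permit (Aufenthaltserlaubnis) for five years;
2. their livelihood is secure;
3. they have paid at least 60 months of compulsory or voluntary contributions to the statutory pension insurance, or can prove expenditure toward a comparable pension entitlement with an insurance or pension institution or an insurance company; periods of career interruption due to childcare or home nursing care are counted accordingly;
4. after weighing the seriousness or nature of any breach of public safety or order, or the danger posed by the foreigner, and taking into account the length of their stay to date and their ties in Germany, there are no grounds of public safety or order against issuing the permit;
5. if they are an employee, they are permitted to work;
6. they hold the other permits required for the permanent pursuit of their occupation;
7. they have sufficient German language skills;
8. they have basic knowledge of the legal and social order and of living conditions in the federal territory; and
9. they have sufficient living space for themselves and the family members living with them.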
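(Further provisions of the same subsection) (sozialgesetzbuch-sgb.de)
- Conditions 7 and 8 are deemed proven if an integration course (Integrationskurs) has been completed successfully.
- Conditions 7 and 8 need not be met if the foreigner cannot meet them because of a physical, mental, or psychological illness or disability.
- Conditions 7 and 8 may also be waived to avoid particular hardship (Härte).
- Conditions 7 and 8 may also be waived if the foreigner can communicate orally in German in a simple manner and either has no entitlement to attend an integration course under §44 Abs.3 Nr.2 or is not obliged to attend one under §44a Abs.2 Nr.3.
- In addition, conditions 2 and 3 need not be met if the foreigner cannot meet them for the illness or disability reasons mentioned above.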
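(3) For spouses living in marital cohabitation, it suffices if one spouse meets the conditions of subsection (2) sentence 1 nos. 3, 5, and 6. The condition of subsection (2) sentence 1 no. 3 (pension contributions or comparable pension proof) may be waived if the foreigner is in education leading to a recognized school or vocational qualification or to a university degree. Sentence 1 applies accordingly in the cases of §26 Abs.4. (sozialgesetzbuch-sgb.de)
(3a) The spouse of a foreigner who holds a settlement permit under §18c (settlement permit for skilled workers) shall be issued a settlement permit if the spouse: (sozialgesetzbuch-sgb.de)
- lives in marital cohabitation with that foreigner;
- has held a residence permit for three years;
- works at least 20 hours per week; and
- meets the conditions of subsection (2) sentence 1 no. 2 and nos. 4 to 9.
Subsection (2) sentences 2 to 6 apply accordingly; issuing a settlement permit under the conditions of subsection (3) remains unaffected.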
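(4) The following periods may be counted toward the period of holding a residence permit that is required for a settlement permit: (sozialgesetzbuch-sgb.de)
- periods of previously holding a residence permit or settlement permit, if the foreigner held a settlement permit at the time of departure (less the time spent outside Germany that caused the settlement permit to lapse); at most four years are counted;
- up to six months for each stay outside Germany that did not cause the residence permit to lapse;
- half of the time of lawful residence for the purpose of study or vocational training.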
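The three crediting rules in (4) amount to simple arithmetic. The sketch below tallies them in months; the function and parameter names are hypothetical, and it assumes the caller has already checked the preconditions of the first rule (settlement permit held on departure, lapse-causing time abroad deducted).

```python
def creditable_months(prior_permit_months, stays_abroad_months, study_months):
    """Tally months creditable under the three rules of (4), as paraphrased.

    All names are illustrative assumptions; inputs are whole months.
    """
    # Rule 1: earlier permit periods, capped at four years (48 months).
    prior = min(prior_permit_months, 48)
    # Rule 2: each stay abroad that did not void the permit counts up to 6 months.
    abroad = sum(min(stay, 6) for stay in stays_abroad_months)
    # Rule 3: lawful study or vocational-training time counts at half.
    study = study_months / 2
    return prior + abroad + study


# 60 prior months (capped at 48), stays of 2 and 9 months (2 + 6), 36 study months (18)
print(creditable_months(60, [2, 9], 36))
```

Whether a given period qualifies at all remains a legal question that the arithmetic does not answer.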
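§ 18 (Principles of skilled-worker immigration; general provisions): translation
§18 Principles of skilled-worker immigration; general provisions
(1) The admission of foreign employees is guided by the needs of Germany as a location for business and research, taking labour market conditions into account. The special opportunities for foreign skilled workers and labour serve to secure the skilled-worker and labour base and to strengthen the social security systems. The provisions aim at the sustainable integration of skilled workers, and of workers with pronounced professional experience, into the labour market and society, while observing public security interests. (sozialgesetzbuch-sgb.de)
(2) Issuing a residence permit for employment under this section requires that: (sozialgesetzbuch-sgb.de)
1. there is a concrete job offer;
2. the Federal Employment Agency (Bundesagentur für Arbeit) has given its approval under §39; this requirement does not apply where a statute, an intergovernmental agreement, or the Employment Ordinance (Beschäftigungsverordnung) permits the employment without the agency's approval; even then, the residence permit may be refused if one of the situations in §40 Abs.2 or Abs.3 applies;
3. where a licence to practise (Berufsausübungserlaubnis) is required, it has been issued or assured;
4. the equivalence (Gleichwertigkeit) of the qualification has been established, or there is a recognized foreign university degree or a foreign university degree comparable to a German one, insofar as this is a condition for issuing the residence permit;
4a. the foreigner and the employer jointly declare that the employment will actually be carried out; and
5. for a first-time issue under §18a or §18b where the foreigner applies after reaching the age of 45, the salary is at least 55% of the annual contribution assessment ceiling (Beitragsbemessungsgrenze) of the statutory pension insurance, unless adequate retirement provision is proven.
Further in the same subsection: where there is a public interest in employing the foreigner (in particular a regional, economic, or labour market policy interest), exceptions to the above conditions may be made in individual cases, particularly where the salary threshold is only slightly undercut or the age threshold only slightly exceeded. The Ministry of the Interior publishes the minimum salary for each year in the Federal Gazette no later than 31 December of the preceding year. (sozialgesetzbuch-sgb.de)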
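(3) For the purposes of this Act, a skilled worker (Fachkraft) is a person who: (sozialgesetzbuch-sgb.de)
- holds a qualified vocational training completed in Germany or an equivalent foreign vocational qualification (skilled worker with vocational training); or
- holds a German university degree, a recognized foreign university degree, or a foreign university degree comparable to a German one (skilled worker with academic training).
(4) Residence permits under §§18a, 18b, 18g, and 19c are issued for four years; if the employment contract or the Federal Employment Agency's approval covers a shorter period, the permit is issued for that shorter period plus 3 months, but for no more than four years in total. (sozialgesetzbuch-sgb.de)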
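The duration rule in (4) can be written as a small function. This is a minimal sketch of the rule as paraphrased above, working in whole months; the function name and parameters are hypothetical.

```python
def permit_duration_months(contract_months=None, approval_months=None):
    """Permit length in months under the rule paraphrased in (4).

    None means "no time limit" for the contract or the agency approval.
    Names and the month-based granularity are illustrative assumptions.
    """
    cap = 48  # four years
    limits = [m for m in (contract_months, approval_months) if m is not None]
    if not limits:
        return cap
    # Shorter of the two periods, plus 3 months, never above the four-year cap.
    return min(min(limits) + 3, cap)


print(permit_duration_months(contract_months=24))  # 24 + 3 months
print(permit_duration_months(contract_months=47))  # capped at 48
```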
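Key differences between §9 and §18 (comparison)
- Different nature
  - §9: defines and regulates what the settlement permit (permanent residence, a residence title of unlimited duration) is, and the general conditions for obtaining it. (sozialgesetzbuch-sgb.de)
  - §18: the general part, or framework, of the system of residence permits for employment (principles of skilled-worker immigration, general conditions, definition of a skilled worker, rules on permit duration). (sozialgesetzbuch-sgb.de)
- Different duration
  - §9: expressly of unlimited duration. (sozialgesetzbuch-sgb.de)
  - §18: most of the corresponding employment-related residence permits are limited in time (usually up to 4 years, or the contract period plus 3 months but no more than 4 years). (sozialgesetzbuch-sgb.de)
- Different focus of the conditions
  - §9: emphasizes stable integration and the capacity for long-term residence: 5 years of residence, secure livelihood, 60 months of pension contributions, language, integration knowledge, housing, public safety, and so on. (sozialgesetzbuch-sgb.de)
  - §18: emphasizes access to employment: a concrete job offer, Federal Employment Agency approval (or a statutory exemption), any required licence to practise, recognition of the degree or qualification, a declaration that the employment is genuine, and the salary or pension threshold for applicants over 45. (sozialgesetzbuch-sgb.de)
- How they relate
  - A common path is to first obtain an employment-related residence permit within the §18 system (e.g. §18a, §18b, or §18g) and then apply under §9 once its conditions are met; some instead go directly for the skilled-worker settlement permit under §18c. The direct reference to §18c in §9(3a) also shows this link. (sozialgesetzbuch-sgb.de)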
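§18a Skilled workers with vocational training (Fachkräfte mit Berufsausbildung): translation
A skilled worker with vocational training shall be issued a residence permit (Aufenthaltserlaubnis) for any qualified employment (qualifizierte Beschäftigung). (Gesetze im Internet)
§18b Skilled workers with academic training (Fachkräfte mit akademischer Ausbildung): translation
A skilled worker with academic training shall be issued a residence permit (Aufenthaltserlaubnis) for any qualified employment (qualifizierte Beschäftigung). (sozialgesetzbuch-sgb.de)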
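§18c Settlement permit for skilled workers (Niederlassungserlaubnis für Fachkräfte): translation
(1) A skilled worker shall be issued a settlement permit (Niederlassungserlaubnis), without the approval of the Federal Employment Agency (BA), if they:
1. have held a residence status under §18a, §18b, §18d, or §18g for 3 years;
2. have a job that they are permitted to hold under the conditions of §18a, §18b, §18d, or §18g;
3. have paid at least 36 months of compulsory or voluntary contributions to the statutory pension insurance (or can prove expenditure toward comparable retirement provision);
4. have sufficient German language skills;
5. also meet the conditions of §9 Abs.2 Satz 1 Nr.2 and Nr.4–6, 8, 9 (with certain exceptions in §9 applying).
In addition, if the skilled worker completed vocational training or a degree in Germany, the 3 years in item 1 can be shortened to 2 years and the 36 months of pension contributions in item 3 to 24 months. (sozialgesetzbuch-sgb.de)
(2) A holder of an EU Blue Card (§18g) shall be issued a settlement permit if they have been employed under §18g for at least 27 months with pension contributions paid, meet the corresponding conditions of §9, and have basic (simple) German; if their German is sufficient, the period can be shortened to 21 months. (sozialgesetzbuch-sgb.de)
(3) In special cases, a highly qualified skilled worker with academic training may be issued a settlement permit without BA approval (and the decision should lean toward issuing it) if it can reasonably be expected that they will integrate into life in Germany and secure their livelihood without state assistance, and §9 Abs.2 Satz 1 Nr.4 is met (no grounds of public safety or order against it). The Länder may additionally require the consent of the supreme Land authority (or a body designated by it). Examples of "highly qualified" include researchers with special expertise, and teaching or senior research staff in prominent positions. (sozialgesetzbuch-sgb.de)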
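§18d Research (Forschung): translation
(1) A foreigner shall be issued a residence permit for the purpose of research under Directive (EU) 2016/801, without BA approval, if:
1. they have concluded: a) a valid hosting agreement (Aufnahmevereinbarung) or equivalent contract for carrying out a research project with a research institution that is recognized in Germany for the special admission procedure for researchers; or b) a valid hosting agreement or equivalent contract with a research institution that carries out research; and
2. the research institution has undertaken in writing to bear costs incurred by public bodies up to 6 months after the end of the hosting agreement, in particular: a) the foreigner's living expenses during an unlawful stay in an EU member state; and b) the costs of removing or deporting the foreigner.
In the case of item 1 a), the residence permit is to be issued within 60 days of the application. (sozialgesetzbuch-sgb.de)
(2) If the research institution's activities are financed mainly from public funds, the cost undertaking in (1) item 2 should as a rule be waived; it may also be waived if there is a special public interest in the research project. The subsection also specifies which provisions apply to the undertaking. (sozialgesetzbuch-sgb.de)
(3) A research institution may also give a general undertaking to the authority responsible for its recognition, covering all foreigners who conclude a hosting agreement with it and receive a research residence permit. (sozialgesetzbuch-sgb.de)
(4) The research residence permit is generally issued for at least 1 year; for participation in EU or multilateral programmes with mobility measures, for at least 2 years. If the research project is shorter, the permit is issued for the project's duration, but still for at least 1 year where the "at least 2 years" rule applies. (sozialgesetzbuch-sgb.de)
(5) A residence permit issued under this section permits research at the research institution named in the hosting agreement and permits teaching activities; a change in the research project during the stay does not automatically invalidate the permit. (sozialgesetzbuch-sgb.de)
(6) A person who has been granted international protection in an EU member state may be issued a residence permit for research if they meet the conditions in (1) and have resided in that member state for at least 2 years after being granted protection; (5) applies accordingly. (sozialgesetzbuch-sgb.de)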
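§18g EU Blue Card (Blaue Karte EU): translation
(1) A skilled worker with academic training shall be issued an EU Blue Card, without BA approval, for employment in Germany that matches their qualification, provided that the salary is at least 50% of the annual contribution assessment ceiling of the statutory pension insurance and none of the grounds for refusal in §19f apply. For the following two groups, however:
- those working in certain occupations (occupations listed in certain ISCO-08 groups); or
- those who obtained their university degree no more than 3 years before applying for the Blue Card;
the Blue Card is instead issued with BA approval, and the salary threshold is lowered to 45.3% of the annual contribution assessment ceiling. Furthermore, if the applicant already holds a §18b residence permit and the licence to practise required for the Blue Card job is the same as for §18b, §18 Abs.2 Nr.3 is deemed met; if the applicant submitted the same degree for the §18b application as for the Blue Card, §18 Abs.2 Nr.4 is deemed met. The corresponding rules can also apply to graduates of tertiary education programmes of at least three years that are equivalent to a university degree. (sozialgesetzbuch-sgb.de)
(2) Applicants who do not meet (1) may, in certain occupational groups (specific ISCO-08 groups), be issued a Blue Card with BA approval, with special treatment of the degree requirement under certain conditions (including: a salary of at least 45.3%; no grounds for refusal under §19f; and proof of at least 3 years of relevant professional experience gained within the last 7 years, at a skill level comparable to a university degree and necessary for the job). (sozialgesetzbuch-sgb.de)
(3) Issuing a Blue Card requires that the concrete job offer provides for an employment period of at least 6 months. (sozialgesetzbuch-sgb.de)
(4) Change of employer or job by a Blue Card holder: generally no permission from the foreigners authority is needed; during the first 12 months of employment, however, the foreigners authority may suspend the change for up to 30 days and refuse it within that period (if the conditions for issuing the Blue Card are no longer met). (sozialgesetzbuch-sgb.de)
(5) In certain cases, livelihood is deemed secure when a Blue Card is issued, namely if the foreigner holds a residence permit under §18a or §18b and does not change jobs. (sozialgesetzbuch-sgb.de)
(6) Special salary threshold for extending a Blue Card: if the applicant obtained the degree no more than 3 years before applying for the extension, or fewer than 24 months have passed since a Blue Card was first issued at the lower threshold (the 45.3% case in (1)), the lower threshold applies to the extension; otherwise the general extension rules apply. (sozialgesetzbuch-sgb.de)
(7) By 31 December each year, the Ministry of the Interior publishes in the Federal Gazette the minimum salaries required under (1) and (2) for the following year. (sozialgesetzbuch-sgb.de)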
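The two salary thresholds in (1) are fixed percentages of the annual contribution assessment ceiling, so the check is a short calculation. The sketch below assumes the ceiling is supplied by the caller (the figure used in the example is a made-up round number, not an official value); the function and parameter names are hypothetical.

```python
from decimal import Decimal


def uses_reduced_rate(listed_occupation, years_since_degree):
    """True when one of the two cases in (1) for the lower threshold applies."""
    return listed_occupation or years_since_degree <= 3


def blue_card_min_salary(ceiling, reduced=False):
    """Minimum annual salary: 50% of the ceiling, or 45.3% in the reduced case."""
    rate = Decimal("0.453") if reduced else Decimal("0.5")
    return Decimal(ceiling) * rate


# Illustrative ceiling only; the real value is published annually.
ceiling = 100000
reduced = uses_reduced_rate(listed_occupation=False, years_since_degree=2)
print(blue_card_min_salary(ceiling, reduced))
```

Using `Decimal` keeps the 45.3% rate exact, which matters when a salary sits right at the threshold.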
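Key differences between the five sections (comparison)
- §18a vs §18b (entry routes to a work residence permit)
  - Both are residence permits (Aufenthaltserlaubnis) for qualified employment;
  - the main difference is whether you are a skilled worker by vocational training or by university degree. (Gesetze im Internet)
- §18g (Blue Card)
  - Still a residence permit, but on the EU Blue Card route; the core is an academic background plus a salary threshold (50%, or 45.3% in specific cases), along with rules on changing jobs. (sozialgesetzbuch-sgb.de)
- §18d (research)
  - Also a residence permit, limited to the purpose of research; the core conditions are a hosting agreement or contract plus the research institution's cost undertaking (with the related waivers and duration rules). (sozialgesetzbuch-sgb.de)
- §18c (settlement / permanent residence)
  - This is the route to a settlement permit (Niederlassungserlaubnis, unlimited duration): as a rule you first hold §18a/18b/18d/18g status for a period and meet the pension, language, and §9-related conditions; Blue Card holders also have an accelerated 27/21-month route. (sozialgesetzbuch-sgb.de)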
Protected: RND (Resistance-Nodulation-Division) efflux pumps ensure the intrinsic tolerance of Acinetobacter baumannii to human-targeted drugs
FASTQ / raw sequencing datasets overview (T. and F.)
1) Per-dataset sample inventory (compact lists)
1. Data_Tam_RNAseq_2024_AUM_MHB_Urine_on_ATCC19606
- X101SC24105589-Z01-J001: AUM-1..3, MHB-1..3, Urine-1..3 (all PE)
- X101SC25062155-Z01-J002: AUM-1..3, AUM-AZI-1..3, MH-1..3, MH-AZI-1..3, Urine-1..3, Urine-AZI-1..3 (all PE)
2. Data_Tam_RNAseq_2025_LB-AB_IJ_W1_Y1_WT_vs_Mac-AB_IJ_W1_Y1_WT_on_ATCC19606
- LB: LB-AB-1..3, LB-IJ-(1,2,4), LB-W1-1..3, LB-WT19606-2..4, LB-Y1-2..4
- Mac: Mac-AB-1..3, Mac-IJ-(1,2,4), Mac-W1-1..3, Mac-WT19606-2..4, Mac-Y1-2..4
3. Data_Tam_RNAseq_2025_subMIC_exposure_on_ATCC19606
Each with reps -1..-3 (all PE):
- 0_5ΔIJ-17, 0_5ΔIJ-24
- preWT-17, preWT-24
- preΔIJ-17, preΔIJ-24
- WT0_5-17, WT0_5-24
- WT-17, WT-24
- ΔIJ-17, ΔIJ-24
4. Data_Tam_DNAseq_2023_lab_strains
- A6WT – Acinetobacter baumannii ATCC19606
- A10CraA – Acinetobacter baumannii ATCC19606
- A12AYE – Acinetobacter baumannii AYE
- A1917978 – Acinetobacter baumannii ATCC17978
5. Data_Tam_DNAseq_2025_AYE-WT_Q_S_craA-Tig4_craA-1-Cm200_craA-2-Cm200
AYE-Q, AYE-S, AYE-WTonTig4, AYE-craAonTig4, AYE-craA-1onCm200, AYE-craA-2onCm200, clinical (all PE)
6. Data_Tam_DNAseq_2025_E.hormaechei-adeABadeIJ_adeIJK_CM1_CM2_on_ATCC19606
adeABadeIJ, adeIJK, CM1, CM2, HF (all PE)
7. Data_Tam_DNAseq_2025_ATCC19606-Y1Y2Y3Y4W1W2W3W4
- Illumina PE: △adeIJ, Tig1, Tig2, W, W2, W3, W4, Y, Y2, Y3, Y4
- Nanopore (`*_fastq_pass.tar`): W1 (3 tar files), W2 (1), W3 (2), W4 (1), Y1 (3), Y2 (1), Y3 (1), Y4 (1)
8. Data_Tam_DNAseq_2026_19606deltaIJfluE
All PE; grouped by background:
- 19606△ABfluE: cef-1, cipro-2, dori-2, nitro-3, pip-1, polyB-3, tet-1
- 19606△IJfluE: cef-4, cipro-3, dori-1, nitro-3, pip-4, polyB-4
- 19606wtfluE: cef-1, cipro-2, dori-1, nitro-1, pip-4, polyB-4, tet-2
9. Data_Tam_DNAseq_2026_Acinetobacter_harbinensis
An6 (PE)
10. Data_Tam_Metagenomics_2026
A1, A1a, A2, B1, B2 (PE)
11. Data_Foong_RNAseq_2021_ATCC19606_Cm (mapping list provided)
- Batch1: WT_1, WT_2B, C_1B, C_2, J_1, J_2
- Batch2: Control, WT_1B, WT_2B, WT_3B, Cra_1, Cra_2, Cra_3, IJ_1B, IJ_2B, IJ_3
- Batch3: adIJ_1, adIJ_2, crA2, crA_ab_1, crA_ab_2, crA_ab_3, adAB_1, adAB_2, adAB_ab1, adAB_ab2, adAB_ab3
12. Data_Foong_DNAseq_2025_AYE_Dark_vs_Light
Dark, Light (PE)
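The compact lists above use two shorthand forms: a range such as `AUM-1..3` and an explicit replicate set such as `LB-IJ-(1,2,4)`. A minimal sketch of expanding that shorthand back into full sample names follows; the function names are hypothetical and the parser only covers these two forms.

```python
import re


def expand_token(token):
    """Expand 'AUM-1..3' or 'LB-IJ-(1,2,4)' into a list of sample names."""
    m = re.fullmatch(r"(.+-)(\d+)\.\.(\d+)", token)
    if m:
        prefix, lo, hi = m.group(1), int(m.group(2)), int(m.group(3))
        return [f"{prefix}{i}" for i in range(lo, hi + 1)]
    m = re.fullmatch(r"(.+-)\(([\d,]+)\)", token)
    if m:
        return [f"{m.group(1)}{i}" for i in m.group(2).split(",")]
    return [token]  # plain name such as 'An6'


def expand_line(line):
    """Split on commas that are not inside parentheses, then expand each token."""
    tokens = re.split(r",\s*(?![^()]*\))", line.strip())
    return [name for tok in tokens for name in expand_token(tok.strip())]


print(expand_line("AUM-1..3, LB-IJ-(1,2,4)"))
# ['AUM-1', 'AUM-2', 'AUM-3', 'LB-IJ-1', 'LB-IJ-2', 'LB-IJ-4']
```

The expanded names can then be compared against the directory names in the complete list to catch a missing replicate.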
2) Dataset-level summary (quick lookup)
| Dataset folder | Year | Data type | Platform / format | Run / project IDs present | Samples (n) | Files (n) | Sample groups / notes |
|---|---|---|---|---|---|---|---|
| Data_Tam_RNAseq_2024_AUM_MHB_Urine_on_ATCC19606/ | 2024 | RNA-seq | Illumina PE (`*_1.fq.gz`, `*_2.fq.gz`) | X101SC24105589-Z01-J001, X101SC25062155-Z01-J002 | 27 | 54 | J001: AUM/MHB/Urine (each 1–3). J002: AUM, AUM-AZI, MH, MH-AZI, Urine, Urine-AZI (each 1–3). |
| Data_Tam_RNAseq_2025_LB-AB_IJ_W1_Y1_WT_vs_Mac-AB_IJ_W1_Y1_WT_on_ATCC19606/ | 2025 | RNA-seq | Illumina PE | X101SC25015922-Z02-J002 | 30 | 60 | LB vs Mac sets; conditions AB, IJ, W1, Y1, WT19606 with listed replicates (mostly 1–3 or 2–4; IJ uses 1,2,4). |
| Data_Tam_RNAseq_2025_subMIC_exposure_on_ATCC19606/ | 2025 | RNA-seq | Illumina PE | X101SC25062155-Z01-J001 | 36 | 72 | 12 condition blocks × 3 reps: preWT, preΔIJ, WT, ΔIJ, WT0_5, 0_5ΔIJ at timepoints 17 and 24. |
| Data_Tam_DNAseq_2025_ATCC19606-Y1Y2Y3Y4W1W2W3W4/ | 2025 | DNA-seq | Illumina PE + Nanopore (`*_fastq_pass.tar`) | Illumina: X101SC24065637-Z01-J001/J002; Nanopore: X101SC25080408-Z01-J001 | 11 (Illumina) + 13 tar archives | 22 + 13 | Illumina: △adeIJ, Tig1, Tig2, W, W2–W4, Y, Y2–Y4. Nanopore: W1(3), W2(1), W3(2), W4(1), Y1(3), Y2(1), Y3(1), Y4(1) tar files. |
| Data_Tam_DNAseq_2025_AYE-WT_Q_S_craA-Tig4_craA-1-Cm200_craA-2-Cm200/ | 2025 | DNA-seq | Illumina PE | X101SC25015922-Z01-J001 | 7 | 14 | AYE variants: AYE-Q, AYE-S, AYE-WTonTig4, AYE-craAonTig4, AYE-craA-1onCm200, AYE-craA-2onCm200, plus clinical. |
| Data_Tam_DNAseq_2025_E.hormaechei-adeABadeIJ_adeIJK_CM1_CM2 | 2025 | DNA-seq | Illumina PE | X101SC24115801-Z01-J001 | 5 | 10 | adeABadeIJ, adeIJK, CM1, CM2, HF. |
| Data_Tam_DNAseq_2026_19606deltaIJfluE/ | 2026 | DNA-seq | Illumina PE | X101SC25116512-Z01-J003 | 20 | 40 | Three backgrounds: `19606△ABfluE*` (7), `19606△IJfluE*` (6), `19606wtfluE*` (7) across drug tags (cef/cipro/dori/nitro/pip/polyB/tet) with replicate suffixes. |
| Data_Tam_DNAseq_2026_Acinetobacter_harbinensis/ | 2026 | DNA-seq | Illumina PE | X101SC25116512-Z01-J002 | 1 | 2 | An6 (paired-end). |
| Data_Tam_Metagenomics_2026/ | 2026 | Metagenomics | Illumina PE | X101SC25123808-Z01-J001 | 5 | 10 | A1, A1a, A2, B1, B2. |
| Data_Foong_RNAseq_2021_ATCC19606_Cm/ | 2021 | RNA-seq | Illumina PE (symlink/mapping list shown) | (paths point to raw_data_batch1/2/3) | 27 | 54 | Batch1: WT/craA/adeIJ (each 2 reps). Batch2: Control + WT.abx + craA.abx + adeIJ.abx (various reps). Batch3: adeIJ, craA, craA.abx, adeAB, adeAB.abx (various reps). |
| Data_Foong_DNAseq_2025_AYE_Dark_vs_Light/ | 2025 | DNA-seq | Illumina PE | X101SC25116512-Z01-J001 | 2 | 4 | Dark, Light. |
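The Samples (n) and Files (n) columns can be rechecked from a path listing in the `./<project>/01.RawData/<sample>/<sample>_{1,2}.fq.gz` layout used in the complete list. The sketch below is one way to do that, assuming the paths are available as plain strings; `summarize` is a hypothetical helper. Samples are keyed by project ID plus sample name because the same name (e.g. AUM-1) occurs in two projects.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Matches paired-end read files such as 'AUM-1_1.fq.gz' / 'AUM-1_2.fq.gz'.
MATE = re.compile(r"(.+)_([12])\.fq\.gz")


def summarize(paths):
    """Return (n_samples, n_files, unpaired) for a list of FASTQ paths."""
    mates = defaultdict(set)
    for p in paths:
        path = PurePosixPath(p)  # a leading './' is dropped by pathlib
        m = MATE.fullmatch(path.name)
        if m is None:
            continue  # not a paired-end FASTQ file, e.g. a Nanopore tar
        mates[(path.parts[0], m.group(1))].add(m.group(2))
    unpaired = sorted(k for k, v in mates.items() if v != {"1", "2"})
    n_files = sum(len(v) for v in mates.values())
    return len(mates), n_files, unpaired


paths = [
    "./X101SC24105589-Z01-J001/01.RawData/AUM-1/AUM-1_1.fq.gz",
    "./X101SC24105589-Z01-J001/01.RawData/AUM-1/AUM-1_2.fq.gz",
    "./X101SC25062155-Z01-J002/01.RawData/AUM-1/AUM-1_1.fq.gz",
    "./X101SC25062155-Z01-J002/01.RawData/AUM-1/AUM-1_2.fq.gz",
]
print(summarize(paths))  # (2, 4, [])
```

A non-empty third element flags samples where one mate file is missing.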
3) Complete list
Data_Tam_RNAseq_2024_AUM_MHB_Urine_on_ATCC19606/
./X101SC24105589-Z01-J001/01.RawData/AUM-1/AUM-1_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/AUM-1/AUM-1_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/AUM-2/AUM-2_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/AUM-2/AUM-2_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/AUM-3/AUM-3_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/AUM-3/AUM-3_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/MHB-1/MHB-1_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/MHB-1/MHB-1_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/MHB-2/MHB-2_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/MHB-2/MHB-2_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/MHB-3/MHB-3_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/MHB-3/MHB-3_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/Urine-1/Urine-1_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/Urine-1/Urine-1_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/Urine-2/Urine-2_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/Urine-2/Urine-2_2.fq.gz
./X101SC24105589-Z01-J001/01.RawData/Urine-3/Urine-3_1.fq.gz
./X101SC24105589-Z01-J001/01.RawData/Urine-3/Urine-3_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-1/AUM-1_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-1/AUM-1_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-2/AUM-2_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-2/AUM-2_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-3/AUM-3_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-3/AUM-3_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-AZI-1/AUM-AZI-1_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-AZI-1/AUM-AZI-1_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-AZI-2/AUM-AZI-2_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-AZI-2/AUM-AZI-2_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-AZI-3/AUM-AZI-3_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/AUM-AZI-3/AUM-AZI-3_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-1/MH-1_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-1/MH-1_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-2/MH-2_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-2/MH-2_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-3/MH-3_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-3/MH-3_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-AZI-1/MH-AZI-1_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-AZI-1/MH-AZI-1_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-AZI-2/MH-AZI-2_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-AZI-2/MH-AZI-2_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-AZI-3/MH-AZI-3_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/MH-AZI-3/MH-AZI-3_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-1/Urine-1_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-1/Urine-1_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-2/Urine-2_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-2/Urine-2_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-3/Urine-3_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-3/Urine-3_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-AZI-1/Urine-AZI-1_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-AZI-1/Urine-AZI-1_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-AZI-2/Urine-AZI-2_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-AZI-2/Urine-AZI-2_2.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-AZI-3/Urine-AZI-3_1.fq.gz
./X101SC25062155-Z01-J002/01.RawData/Urine-AZI-3/Urine-AZI-3_2.fq.gz
Data_Tam_RNAseq_2025_LB-AB_IJ_W1_Y1_WT_vs_Mac-AB_IJ_W1_Y1_WT_on_ATCC19606/
./X101SC25015922-Z02-J002/01.RawData/LB-AB-1/LB-AB-1_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-AB-1/LB-AB-1_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-AB-2/LB-AB-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-AB-2/LB-AB-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-AB-3/LB-AB-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-AB-3/LB-AB-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-IJ-1/LB-IJ-1_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-IJ-1/LB-IJ-1_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-IJ-2/LB-IJ-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-IJ-2/LB-IJ-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-IJ-4/LB-IJ-4_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-IJ-4/LB-IJ-4_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-W1-1/LB-W1-1_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-W1-1/LB-W1-1_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-W1-2/LB-W1-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-W1-2/LB-W1-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-W1-3/LB-W1-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-W1-3/LB-W1-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-WT19606-2/LB-WT19606-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-WT19606-2/LB-WT19606-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-WT19606-3/LB-WT19606-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-WT19606-3/LB-WT19606-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-WT19606-4/LB-WT19606-4_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-WT19606-4/LB-WT19606-4_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-Y1-2/LB-Y1-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-Y1-2/LB-Y1-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-Y1-3/LB-Y1-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-Y1-3/LB-Y1-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-Y1-4/LB-Y1-4_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/LB-Y1-4/LB-Y1-4_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-AB-1/Mac-AB-1_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-AB-1/Mac-AB-1_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-AB-2/Mac-AB-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-AB-2/Mac-AB-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-AB-3/Mac-AB-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-AB-3/Mac-AB-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-IJ-1/Mac-IJ-1_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-IJ-1/Mac-IJ-1_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-IJ-2/Mac-IJ-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-IJ-2/Mac-IJ-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-IJ-4/Mac-IJ-4_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-IJ-4/Mac-IJ-4_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-W1-1/Mac-W1-1_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-W1-1/Mac-W1-1_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-W1-2/Mac-W1-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-W1-2/Mac-W1-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-W1-3/Mac-W1-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-W1-3/Mac-W1-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-WT19606-2/Mac-WT19606-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-WT19606-2/Mac-WT19606-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-WT19606-3/Mac-WT19606-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-WT19606-3/Mac-WT19606-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-WT19606-4/Mac-WT19606-4_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-WT19606-4/Mac-WT19606-4_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-Y1-2/Mac-Y1-2_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-Y1-2/Mac-Y1-2_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-Y1-3/Mac-Y1-3_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-Y1-3/Mac-Y1-3_2.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-Y1-4/Mac-Y1-4_1.fq.gz
./X101SC25015922-Z02-J002/01.RawData/Mac-Y1-4/Mac-Y1-4_2.fq.gz
Data_Tam_RNAseq_2025_subMIC_exposure_on_ATCC19606/
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-1/0_5ΔIJ-17-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-1/0_5ΔIJ-17-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-2/0_5ΔIJ-17-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-2/0_5ΔIJ-17-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-3/0_5ΔIJ-17-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-17-3/0_5ΔIJ-17-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-1/0_5ΔIJ-24-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-1/0_5ΔIJ-24-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-2/0_5ΔIJ-24-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-2/0_5ΔIJ-24-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-3/0_5ΔIJ-24-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/0_5ΔIJ-24-3/0_5ΔIJ-24-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-17-1/preWT-17-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-17-1/preWT-17-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-17-2/preWT-17-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-17-2/preWT-17-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-17-3/preWT-17-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-17-3/preWT-17-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-24-1/preWT-24-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-24-1/preWT-24-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-24-2/preWT-24-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-24-2/preWT-24-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-24-3/preWT-24-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preWT-24-3/preWT-24-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-1/preΔIJ-17-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-1/preΔIJ-17-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-2/preΔIJ-17-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-2/preΔIJ-17-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-3/preΔIJ-17-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-17-3/preΔIJ-17-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-1/preΔIJ-24-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-1/preΔIJ-24-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-2/preΔIJ-24-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-2/preΔIJ-24-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-3/preΔIJ-24-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/preΔIJ-24-3/preΔIJ-24-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-17-1/WT0_5-17-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-17-1/WT0_5-17-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-17-2/WT0_5-17-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-17-2/WT0_5-17-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-17-3/WT0_5-17-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-17-3/WT0_5-17-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-24-1/WT0_5-24-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-24-1/WT0_5-24-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-24-2/WT0_5-24-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-24-2/WT0_5-24-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-24-3/WT0_5-24-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT0_5-24-3/WT0_5-24-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-17-1/WT-17-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-17-1/WT-17-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-17-2/WT-17-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-17-2/WT-17-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-17-3/WT-17-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-17-3/WT-17-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-24-1/WT-24-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-24-1/WT-24-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-24-2/WT-24-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-24-2/WT-24-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-24-3/WT-24-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/WT-24-3/WT-24-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-1/ΔIJ-17-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-1/ΔIJ-17-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-2/ΔIJ-17-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-2/ΔIJ-17-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-3/ΔIJ-17-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-17-3/ΔIJ-17-3_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-1/ΔIJ-24-1_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-1/ΔIJ-24-1_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-2/ΔIJ-24-2_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-2/ΔIJ-24-2_2.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-3/ΔIJ-24-3_1.fq.gz
./X101SC25062155-Z01-J001/01.RawData/ΔIJ-24-3/ΔIJ-24-3_2.fq.gz
Data_Tam_DNAseq_2025_ATCC19606-Y1Y2Y3Y4W1W2W3W4/
Illumina short-read sequencing:
./X101SC24065637-Z01-J001/01.RawData/△adeIJ/△adeIJ_1.fq.gz
./X101SC24065637-Z01-J001/01.RawData/△adeIJ/△adeIJ_2.fq.gz
./X101SC24065637-Z01-J001/01.RawData/Tig1/Tig1_1.fq.gz
./X101SC24065637-Z01-J001/01.RawData/Tig1/Tig1_2.fq.gz
./X101SC24065637-Z01-J001/01.RawData/Tig2/Tig2_1.fq.gz
./X101SC24065637-Z01-J001/01.RawData/Tig2/Tig2_2.fq.gz
./X101SC24065637-Z01-J001/01.RawData/W/W_1.fq.gz
./X101SC24065637-Z01-J001/01.RawData/W/W_2.fq.gz
./X101SC24065637-Z01-J002/01.RawData/W2/W2_1.fq.gz
./X101SC24065637-Z01-J002/01.RawData/W2/W2_2.fq.gz
./X101SC24065637-Z01-J002/01.RawData/W3/W3_1.fq.gz
./X101SC24065637-Z01-J002/01.RawData/W3/W3_2.fq.gz
./X101SC24065637-Z01-J002/01.RawData/W4/W4_1.fq.gz
./X101SC24065637-Z01-J002/01.RawData/W4/W4_2.fq.gz
./X101SC24065637-Z01-J001/01.RawData/Y/Y_1.fq.gz
./X101SC24065637-Z01-J001/01.RawData/Y/Y_2.fq.gz
./X101SC24065637-Z01-J002/01.RawData/Y2/Y2_1.fq.gz
./X101SC24065637-Z01-J002/01.RawData/Y2/Y2_2.fq.gz
./X101SC24065637-Z01-J002/01.RawData/Y3/Y3_1.fq.gz
./X101SC24065637-Z01-J002/01.RawData/Y3/Y3_2.fq.gz
./X101SC24065637-Z01-J002/01.RawData/Y4/Y4_1.fq.gz
./X101SC24065637-Z01-J002/01.RawData/Y4/Y4_2.fq.gz
Nanopore long-read sequencing:
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/W1/0710_2F_PBG50143_74807b09/W1_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/W1/0629_2H_PBG55359_f19e323f/W1_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/W1/0631_2C_PBG05153_55abe88b/W1_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/W2/0620_2C_PBG17000_6bfd0048/W2_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/W3/0710_2F_PBG50143_74807b09/W3_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/W3/0629_2H_PBG55359_f19e323f/W3_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/W4/0620_2C_PBG17000_6bfd0048/W4_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/Y1/0655_3B_PBE70655_6bbd09a4/Y1_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/Y1/0620_2C_PBG17000_6bfd0048/Y1_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/Y1/0631_2C_PBG05153_55abe88b/Y1_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/Y2/0620_2C_PBG17000_6bfd0048/Y2_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/Y3/0620_2C_PBG17000_6bfd0048/Y3_fastq_pass.tar
./X101SC25080408-Z01-J001/Release-X101SC25080408-Z01-J001-20251009/Data-X101SC25080408-Z01-J001/Y4/0620_2C_PBG17000_6bfd0048/Y4_fastq_pass.tar
Data_Tam_DNAseq_2025_AYE-WT_Q_S_craA-Tig4_craA-1-Cm200_craA-2-Cm200/
./X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_1.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_2.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_1.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_2.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz
./X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz
./X101SC25015922-Z01-J001/01.RawData/clinical/clinical_1.fq.gz
./X101SC25015922-Z01-J001/01.RawData/clinical/clinical_2.fq.gz
Data_Tam_DNAseq_2025_E.hormaechei-adeABadeIJ_adeIJK_CM1_CM2
./X101SC24115801-Z01-J001/01.RawData/adeABadeIJ/adeABadeIJ_1.fq.gz
./X101SC24115801-Z01-J001/01.RawData/adeABadeIJ/adeABadeIJ_2.fq.gz
./X101SC24115801-Z01-J001/01.RawData/adeIJK/adeIJK_1.fq.gz
./X101SC24115801-Z01-J001/01.RawData/adeIJK/adeIJK_2.fq.gz
./X101SC24115801-Z01-J001/01.RawData/CM1/CM1_1.fq.gz
./X101SC24115801-Z01-J001/01.RawData/CM1/CM1_2.fq.gz
./X101SC24115801-Z01-J001/01.RawData/CM2/CM2_1.fq.gz
./X101SC24115801-Z01-J001/01.RawData/CM2/CM2_2.fq.gz
./X101SC24115801-Z01-J001/01.RawData/HF/HF_1.fq.gz
./X101SC24115801-Z01-J001/01.RawData/HF/HF_2.fq.gz
Data_Tam_DNAseq_2026_19606deltaIJfluE/
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEcef-1/19606△ABfluEcef-1_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEcef-1/19606△ABfluEcef-1_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEcipro-2/19606△ABfluEcipro-2_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEcipro-2/19606△ABfluEcipro-2_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEdori-2/19606△ABfluEdori-2_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEdori-2/19606△ABfluEdori-2_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEnitro-3/19606△ABfluEnitro-3_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEnitro-3/19606△ABfluEnitro-3_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEpip-1/19606△ABfluEpip-1_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEpip-1/19606△ABfluEpip-1_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEpolyB-3/19606△ABfluEpolyB-3_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEpolyB-3/19606△ABfluEpolyB-3_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEtet-1/19606△ABfluEtet-1_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△ABfluEtet-1/19606△ABfluEtet-1_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEcef-4/19606△IJfluEcef-4_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEcef-4/19606△IJfluEcef-4_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEcipro-3/19606△IJfluEcipro-3_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEcipro-3/19606△IJfluEcipro-3_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEdori-1/19606△IJfluEdori-1_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEdori-1/19606△IJfluEdori-1_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEnitro-3/19606△IJfluEnitro-3_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEnitro-3/19606△IJfluEnitro-3_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEpip-4/19606△IJfluEpip-4_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEpip-4/19606△IJfluEpip-4_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEpolyB-4/19606△IJfluEpolyB-4_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606△IJfluEpolyB-4/19606△IJfluEpolyB-4_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEcef-1/19606wtfluEcef-1_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEcef-1/19606wtfluEcef-1_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEcipro-2/19606wtfluEcipro-2_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEcipro-2/19606wtfluEcipro-2_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEdori-1/19606wtfluEdori-1_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEdori-1/19606wtfluEdori-1_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEnitro-1/19606wtfluEnitro-1_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEnitro-1/19606wtfluEnitro-1_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEpip-4/19606wtfluEpip-4_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEpip-4/19606wtfluEpip-4_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEpolyB-4/19606wtfluEpolyB-4_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEpolyB-4/19606wtfluEpolyB-4_2.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEtet-2/19606wtfluEtet-2_1.fq.gz
./X101SC25116512-Z01-J003/01.RawData/19606wtfluEtet-2/19606wtfluEtet-2_2.fq.gz
Data_Tam_DNAseq_2026_Acinetobacter_harbinensis/
./X101SC25116512-Z01-J002/01.RawData/An6/An6_1.fq.gz
./X101SC25116512-Z01-J002/01.RawData/An6/An6_2.fq.gz
Data_Tam_Metagenomics_2026/
./X101SC25123808-Z01-J001/01.RawData/A1/A1_1.fq.gz
./X101SC25123808-Z01-J001/01.RawData/A1/A1_2.fq.gz
./X101SC25123808-Z01-J001/01.RawData/A1a/A1a_1.fq.gz
./X101SC25123808-Z01-J001/01.RawData/A1a/A1a_2.fq.gz
./X101SC25123808-Z01-J001/01.RawData/A2/A2_1.fq.gz
./X101SC25123808-Z01-J001/01.RawData/A2/A2_2.fq.gz
./X101SC25123808-Z01-J001/01.RawData/B1/B1_1.fq.gz
./X101SC25123808-Z01-J001/01.RawData/B1/B1_2.fq.gz
./X101SC25123808-Z01-J001/01.RawData/B2/B2_1.fq.gz
./X101SC25123808-Z01-J001/01.RawData/B2/B2_2.fq.gz
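The raw-data listings above follow a `<sample>_1.fq.gz` / `<sample>_2.fq.gz` naming pattern for paired-end reads. As a minimal sketch (the helper names `group_pairs` and `incomplete_samples` are hypothetical, and the regular expression assumes every mate file ends in `_1.fq.gz` or `_2.fq.gz`), such a listing could be checked for missing mates like this:

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed naming pattern: "<sample>_1.fq.gz" / "<sample>_2.fq.gz".
PAIR_RE = re.compile(r"^(?P<sample>.+)_(?P<read>[12])\.fq\.gz$")


def group_pairs(paths):
    """Map each sample name to {"1": path, "2": path} based on the file name."""
    pairs = defaultdict(dict)
    for p in paths:
        m = PAIR_RE.match(PurePosixPath(p).name)
        if m is None:
            continue  # not a paired-end FASTQ name; ignore it
        pairs[m["sample"]][m["read"]] = p
    return dict(pairs)


def incomplete_samples(pairs):
    """Return the sample names for which one of the two mate files is missing."""
    return sorted(s for s, reads in pairs.items() if set(reads) != {"1", "2"})


# A few lines copied from the listing above.
listing = [
    "./X101SC25123808-Z01-J001/01.RawData/A1/A1_1.fq.gz",
    "./X101SC25123808-Z01-J001/01.RawData/A1/A1_2.fq.gz",
    "./X101SC25123808-Z01-J001/01.RawData/A1a/A1a_1.fq.gz",
    "./X101SC25123808-Z01-J001/01.RawData/A1a/A1a_2.fq.gz",
]
pairs = group_pairs(listing)
print(sorted(pairs))
print(incomplete_samples(pairs))
```

The check only looks at file names; it does not verify that the files exist or that read counts match between mates.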
Data_Foong_RNAseq_2021_ATCC19606_Cm/
wt_r1_R1.fq.gz -> ../raw_data_batch1/WT_1_1.fq.gz
wt_r1_R2.fq.gz -> ../raw_data_batch1/WT_1_2.fq.gz
wt_r2_R1.fq.gz -> ../raw_data_batch1/WT_2B_1.fq.gz
wt_r2_R2.fq.gz -> ../raw_data_batch1/WT_2B_2.fq.gz
craA_r1_R1.fq.gz -> ../raw_data_batch1/C_1B_1.fq.gz
craA_r1_R2.fq.gz -> ../raw_data_batch1/C_1B_2.fq.gz
craA_r2_R1.fq.gz -> ../raw_data_batch1/C_2_1.fq.gz
craA_r2_R2.fq.gz -> ../raw_data_batch1/C_2_2.fq.gz
adeIJ_r1_R1.fq.gz -> ../raw_data_batch1/J_1_1.fq.gz
adeIJ_r1_R2.fq.gz -> ../raw_data_batch1/J_1_2.fq.gz
adeIJ_r2_R1.fq.gz -> ../raw_data_batch1/J_2_1.fq.gz
adeIJ_r2_R2.fq.gz -> ../raw_data_batch1/J_2_2.fq.gz
wt_r3_R1.fq.gz -> ../raw_data_batch2/Control_1.fq.gz
wt_r3_R2.fq.gz -> ../raw_data_batch2/Control_2.fq.gz
wt.abx_r1_R1.fq.gz -> ../raw_data_batch2/WT_1B_1.fq.gz
wt.abx_r1_R2.fq.gz -> ../raw_data_batch2/WT_1B_2.fq.gz
wt.abx_r2_R1.fq.gz -> ../raw_data_batch2/WT_2B_1.fq.gz
wt.abx_r2_R2.fq.gz -> ../raw_data_batch2/WT_2B_2.fq.gz
wt.abx_r3_R1.fq.gz -> ../raw_data_batch2/WT_3B_1.fq.gz
wt.abx_r3_R2.fq.gz -> ../raw_data_batch2/WT_3B_2.fq.gz
craA.abx_r1_R1.fq.gz -> ../raw_data_batch2/Cra_1_1.fq.gz
craA.abx_r1_R2.fq.gz -> ../raw_data_batch2/Cra_1_2.fq.gz
craA.abx_r2_R1.fq.gz -> ../raw_data_batch2/Cra_2_1.fq.gz
craA.abx_r2_R2.fq.gz -> ../raw_data_batch2/Cra_2_2.fq.gz
craA.abx_r3_R1.fq.gz -> ../raw_data_batch2/Cra_3_1.fq.gz
craA.abx_r3_R2.fq.gz -> ../raw_data_batch2/Cra_3_2.fq.gz
adeIJ.abx_r1_R1.fq.gz -> ../raw_data_batch2/IJ_1B_1.fq.gz
adeIJ.abx_r1_R2.fq.gz -> ../raw_data_batch2/IJ_1B_2.fq.gz
adeIJ.abx_r2_R1.fq.gz -> ../raw_data_batch2/IJ_2B_1.fq.gz
adeIJ.abx_r2_R2.fq.gz -> ../raw_data_batch2/IJ_2B_2.fq.gz
adeIJ.abx_r3_R1.fq.gz -> ../raw_data_batch2/IJ_3_1.fq.gz
adeIJ.abx_r3_R2.fq.gz -> ../raw_data_batch2/IJ_3_2.fq.gz
adeIJ_r3_R1.fq.gz -> ../raw_data_batch3/adIJ_1_1.fq.gz
adeIJ_r3_R2.fq.gz -> ../raw_data_batch3/adIJ_1_2.fq.gz
adeIJ_r4_R1.fq.gz -> ../raw_data_batch3/adIJ_2_1.fq.gz
adeIJ_r4_R2.fq.gz -> ../raw_data_batch3/adIJ_2_2.fq.gz
craA_r3_R1.fq.gz -> ../raw_data_batch3/crA2_1.fq.gz
craA_r3_R2.fq.gz -> ../raw_data_batch3/crA2_2.fq.gz
craA.abx_r4_R1.fq.gz -> ../raw_data_batch3/crA_ab_1_1.fq.gz
craA.abx_r4_R2.fq.gz -> ../raw_data_batch3/crA_ab_1_2.fq.gz
craA.abx_r5_R1.fq.gz -> ../raw_data_batch3/crA_ab_2_1.fq.gz
craA.abx_r5_R2.fq.gz -> ../raw_data_batch3/crA_ab_2_2.fq.gz
craA.abx_r6_R1.fq.gz -> ../raw_data_batch3/crA_ab_3_1.fq.gz
craA.abx_r6_R2.fq.gz -> ../raw_data_batch3/crA_ab_3_2.fq.gz
adeAB_r1_R1.fq.gz -> ../raw_data_batch3/adAB_1_1.fq.gz
adeAB_r1_R2.fq.gz -> ../raw_data_batch3/adAB_1_2.fq.gz
adeAB_r2_R1.fq.gz -> ../raw_data_batch3/adAB_2_1.fq.gz
adeAB_r2_R2.fq.gz -> ../raw_data_batch3/adAB_2_2.fq.gz
adeAB.abx_r1_R1.fq.gz -> ../raw_data_batch3/adAB_ab1_1.fq.gz
adeAB.abx_r1_R2.fq.gz -> ../raw_data_batch3/adAB_ab1_2.fq.gz
adeAB.abx_r2_R1.fq.gz -> ../raw_data_batch3/adAB_ab2_1.fq.gz
adeAB.abx_r2_R2.fq.gz -> ../raw_data_batch3/adAB_ab2_2.fq.gz
adeAB.abx_r3_R1.fq.gz -> ../raw_data_batch3/adAB_ab3_1.fq.gz
adeAB.abx_r3_R2.fq.gz -> ../raw_data_batch3/adAB_ab3_2.fq.gz
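The `name -> target` lines above describe symlinks that give the raw files sample-oriented names. The sketch below shows one possible way to parse such lines, flag raw files that are linked more than once, and recreate the links. It assumes a POSIX system, and the function names are hypothetical, not part of any existing script in these directories.

```python
import os
import tempfile


def parse_link_lines(lines):
    """Parse 'link_name -> target' lines into (link_name, target) tuples."""
    mapping = []
    for line in lines:
        if "->" not in line:
            continue  # skip headers and blank lines
        link, target = (part.strip() for part in line.split("->", 1))
        mapping.append((link, target))
    return mapping


def duplicate_targets(mapping):
    """Return the targets that are referenced by more than one link name."""
    seen, dups = set(), set()
    for _, target in mapping:
        (dups if target in seen else seen).add(target)
    return sorted(dups)


def create_links(mapping, directory):
    """Create each symlink inside `directory`, skipping names that already exist."""
    for link, target in mapping:
        path = os.path.join(directory, link)
        if not os.path.lexists(path):
            os.symlink(target, path)


# Two lines copied from the mapping above; the links are created in a
# throwaway directory, so the targets are dangling in this demonstration.
lines = [
    "wt_r1_R1.fq.gz -> ../raw_data_batch1/WT_1_1.fq.gz",
    "wt_r1_R2.fq.gz -> ../raw_data_batch1/WT_1_2.fq.gz",
]
mapping = parse_link_lines(lines)
print(duplicate_targets(mapping))
with tempfile.TemporaryDirectory() as tmp:
    create_links(mapping, tmp)
    print(os.readlink(os.path.join(tmp, "wt_r1_R1.fq.gz")))
```

Checking for duplicate targets matters here because similar raw names occur in different batches (for example `WT_2B` in both `raw_data_batch1` and `raw_data_batch2`); the comparison uses the full relative path, so those two count as distinct targets.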
Data_Foong_DNAseq_2025_AYE_Dark_vs_Light/
./X101SC25116512-Z01-J001/01.RawData/Dark/Dark_1.fq.gz
./X101SC25116512-Z01-J001/01.RawData/Dark/Dark_2.fq.gz
./X101SC25116512-Z01-J001/01.RawData/Light/Light_1.fq.gz
./X101SC25116512-Z01-J001/01.RawData/Light/Light_2.fq.gz
Directory Listings Summary (Disk Directories)
/media/jhuang/INTENSO
(empty; data now on ~/DATA_Intenso)
| # | Name |
|---|---|
| 1 | (empty) |
~/DATA
| # | Name |
|---|---|
| 1 | Data_Ute_MKL1 |
| 2 | Data_Ute_RNA_4_2022-11_test |
| 3 | Data_Ute_RNA_3 |
| 4 | Data_Susanne_Carotis_RNASeq_PUBLISHING |
| 5 | Data_Jiline_Yersinia_SNP |
| 6 | Data_Tam_ABAYE_RS05070_on_A_calcoaceticus_baumannii_complex_DUPLICATED_DEL |
| 7 | Data_Nicole_CRC1648 |
| 8 | Mouse_HS3ST1_12373_out |
| 9 | Mouse_HS3ST1_12175_out |
| 10 | Data_Biobakery |
| 11 | Data_Xiaobo_10x_2 |
| 12 | Data_Xiaobo_10x_3 |
| 13 | Talk_Nicole_CRC1648 |
| 14 | Talks_Bioinformatics_Meeting |
| 15 | Talks_resources |
| 16 | Data_Susanne_MPox_DAMIAN |
| 17 | Data_host_transcriptional_response |
| 18 | Talks_including_DEEP-DV |
| 19 | DOKTORARBEIT |
| 20 | Data_Susanne_MPox |
| 21 | Data_Jiline_Transposon |
| 22 | Data_Jiline_Transposon2 |
| 23 | Data_Matlab |
| 24 | deepseek-ai |
| 25 | Stick_Mi_DEL |
| 26 | TODO_shares |
| 27 | Data_Ute_RNA_4 |
| 28 | Data_Liu_PCA_plot |
| 29 | README_run_viral-ngs_inside_Docker |
| 30 | README_compare_genomes |
| 31 | mapped.bam |
| 32 | Data_Serpapi |
| 33 | Data_Ute_RNA_1_2 |
| 34 | Data_Marc_RNAseq_2024 |
| 35 | Data_Nicole_CaptureProbeSequencing |
| 36 | LOG_mapping |
| 37 | Data_Huang_Human_herpesvirus_3 |
| 38 | Data_Nicole_DAMIAN_Post-processing_Pathoprobe_FluB_Links |
| 39 | Access_to_Win7 |
| 40 | Data_DAMIAN_Post-processing_Flavivirus_and_FSME_and_Haemophilus |
| 41 | Data_Luise_Sepi_STKN |
| 42 | Data_Patricia_Sepi_7samples |
| 43 | Data_Soeren_2025_PUBLISHING |
| 44 | Data_Ben_RNAseq_2025 |
| 45 | Data_Tam_DNAseq_2025_AYE-WT_Q_S_craA-Tig4_craA-1-Cm200_craA-2-Cm200 |
| 46 | Data_Patricia_Transposon |
| 47 | Data_Patricia_Transposon_2025 |
| 48 | Colocation_Space |
| 49 | Data_Tam_Methylation_2025_empty |
| 50 | 2025-11-03_eVB-Schreiben_12-57.pdf |
| 51 | DEGs_Group1_A1-A3+A8-A10_vs_Group2_B10-B16.png |
| 52 | README.pdf |
| 53 | Data_Hannes_JCM00612 |
| 54 | 167_redundant_DEL |
| 55 | Lehre_Bioinformatik |
| 56 | Data_Ben_Boruta_Analysis |
| 57 | Data_Childrensclinic_16S_2025_DEL |
| 58 | Data_Ben_Mycobacterium_pseudoscrofulaceum |
| 59 | Foong_RNA_mSystems_Huang_Changed.txt |
| 60 | Data_Pietro_Scatturo_and_Charlotte_Uetrecht_16S_2025 |
| 61 | Data_JuliaBerger_RNASeq_SARS-CoV-2 |
| 62 | Data_PaulBongarts_S.epidermidis_HDRNA |
| 63 | Data_Ute |
| 64 | Data_Foong_DNAseq_2025_AYE_Dark_vs_Light_TODO |
| 65 | Data_Foong_RNAseq_2021_ATCC19606_Cm |
| 66 | Data_Tam_Funding |
| 67 | Data_Tam_RNAseq_2025_LB-AB_IJ_W1_Y1_WT_vs_Mac-AB_IJ_W1_Y1_WT_on_ATCC19606 |
| 68 | Data_Tam_RNAseq_2025_subMIC_exposure_on_ATCC19606 |
| 69 | Data_Tam.txt |
| 70 | Data_Tam_RNAseq_2024_AUM_MHB_Urine_on_ATCC19606 |
| 71 | Data_Tam_Metagenomics_2026 |
| 72 | Data_Michelle |
| 73 | Data_Nicole_16S_2025_Childrensclinic |
| 74 | Data_Sophie_HDV_Sequences |
| 75 | Data_Tam_DNAseq_2026_19606deltaIJfluE |
| 76 | README_nf-core |
| 77 | Data_Vero_Kymographs |
| 78 | Access_to_Win10 |
| 79 | Data_Patricia_AMRFinderPlus_2025 |
| 80 | Data_Tam_DNAseq_2025_Unknown-adeABadeIJ_adeIJK_CM1_CM2 |
| 81 | Data_Damian |
| 82 | Data_Karoline_16S |
| 83 | Data_JuliaFuchs_RNAseq_2025 |
| 84 | Data_Tam_DNAseq_2025_ATCC19606-Y1Y2Y3Y4W1W2W3W4_TODO |
| 85 | Data_Tam_DNAseq_2026_Acinetobacter_harbinensis |
| 86 | Data_Benjamin_DNAseq_2026_GE11174 |
| 87 | Data_Susanne_spatialRNA_2022.9.1_backup |
| 88 | Data_Susanne_spatialRNA |
~/DATA_A
| # | Name |
|---|---|
| 1 | Data_Damian_NEW_CREATED |
| 2 | Data_R_bubbleplots |
| 3 | Data_Ute_TRANSFERED_DEL |
| 4 | Paper_Target_capture_sequencing_MHH_PUBLISHED |
| 5 | Data_Nicole8_Lamprecht_new_PUBLISHED |
| 6 | Data_Samira_RNAseq |
~/DATA_B
| # | Name |
|---|---|
| 1 | Data_DAMIAN_endocarditis_encephalitis |
| 2 | Data_Denise_sT_PUBLISHING |
| 3 | Data_Fran2_16S_func |
| 4 | Data_Holger_5179-R1_vs_5179 |
| 5 | Antraege_ |
| 6 | Data_16S_Nicole_210222 |
| 7 | Data_Adam_Influenza_A_virus |
| 8 | Data_Anna_Efaecium_assembly |
| 9 | Data_Bactopia |
| 10 | Data_Ben_RNAseq |
| 11 | Data_Johannes_PIV3 |
| 12 | Data_Luise_Epidome_longitudinal_nose |
| 13 | Data_Manja_Hannes_Probedesign |
| 14 | Data_Marc_AD_PUBLISHING |
| 15 | Data_Marc_RNA-seq_Saureus_Review |
| 16 | Data_Nicole_16S |
| 17 | Data_Nicole_cfDNA_pathogens |
| 18 | Data_Ring_and_CSF_PegivirusC_DAMIAN |
| 19 | Data_Song_Microarray |
| 20 | Data_Susanne_Omnikron |
| 21 | Data_Viro |
| 22 | Doktorarbeit |
| 23 | Poster_Rohde_20230724 |
| 24 | Data_Django |
| 25 | Data_Holger_S.epidermidis_1585_5179_HD05 |
| 26 | Data_Manja_RNAseq_Organoids_Virus |
| 27 | Data_Holger_MT880870_MT880872_Annotation |
| 28 | Data_Soeren_RNA-seq_2022 |
| 29 | Data_Manja_RNAseq_Organoids_Merged |
| 30 | Data_Gunnar_Yersiniomics |
| 31 | Data_Manja_RNAseq_Organoids |
| 32 | Data_Susanne_Carotis_MS |
~/DATA_C
(names only; as listed)
| # | Name |
|---|---|
| 1 | 2022-10-27_IRI_manuscript_v03_JH.docx |
| 2 | 16304905.fasta |
| 3 | ‘16S data manuscript_NF.docx’ |
| 4 | 180820_2_supp_4265595_sw6zjk.docx |
| 5 | 180820_2_supp_4265596_sw6zjk.docx |
| 6 | 1a_vs_3.csv |
| 7 | ‘2.05.01.05-A01 Urlaubsantrag-Shuting-beantragt.pdf’ |
| 8 | 2014SawickaBBA.pdf |
| 9 | 20160509Manuscript_NDM_OXA_mitKomm.doc |
| 10 | 220607_Agenda_monthly_meeting.pdf |
| 11 | ‘20221129 Table mutations.docx’ |
| 12 | 230602_NB501882_0428_AHKG53BGXT.zip |
| 13 | 362383173.rar |
| 14 | 562.9459.1.fa |
| 15 | 562.9459.1_rc.fa |
| 16 | ASA3P.pdf |
| 17 | All_indels_annotated_vHR.xlsx |
| 18 | ‘Amplikon_indeces_Susanne +groups.xlsx’ |
| 19 | Amplikon_indeces_Susanne.xlsx |
| 20 | GAMOLA2 |
| 21 | Data_Susanne_Carotis_spatialRNA_PUBLISHING (dead link) |
| 22 | Data_Paul_Staphylococcus_epidermidis |
| 23 | Data_Nicola_Schaltenberg_PICRUSt |
| 24 | Data_Nicola_Schaltenberg |
| 25 | Data_Nicola_Gagliani |
| 26 | Data_methylome_MMc |
| 27 | Data_Jingang |
| 28 | Data_Indra_RNASeq_GSM2262901 |
| 29 | Data_Holger_VRE |
| 30 | Data_Holger_Pseudomonas_aeruginosa_SNP |
| 31 | Data_Hannes_ChIPSeq |
| 32 | Data_Emilia_MeDIP |
| 33 | Data_ChristophFR_HepE_published |
| 34 | Data_Christopher_MeDIP_MMc_published |
| 35 | Data_Anna_Kieler_Sepi_Staemme |
| 36 | Data_Anna12_HAPDICS_final |
| 37 | Data_Anastasia_RNASeq_PUBLISHING |
| 38 | Aufnahmeantrag_komplett_10_2022.pdf |
| 39 | Astrovirus.pdf |
| 40 | COMMANDS |
| 41 | Bacterial_pipelines.txt |
| 42 | COMPSRA_uke_DEL.jar |
| 43 | ChIPSeq_pipeline_desc.docx |
| 44 | ChIPSeq_pipeline_desc.pdf |
| 45 | Comparative_genomic_analysis_of_eight_novel_haloal.pdf |
| 46 | CvO_Klassenliste_7_3.pdf |
| 47 | ‘Copy of pool_b1_CGATGT_300.xlsx’ |
| 48 | Fran_16S_Exp8-17-21-27.txt |
| 49 | HPI_DRIVE |
| 50 | HEV_aligned.fasta |
| 51 | INTENSO_DIR |
| 52 | HPI_samples_for_NGS_29.09.22.xlsx |
| 53 | Hotmail_to_Gmail |
| 54 | Indra_Thesis_161020.pdf |
| 55 | ‘LT K331A.gbk’ |
| 56 | LOG_p954_stat |
| 57 | LOG |
| 58 | Manuscript_10_02_2021.docx |
| 59 | Metagenomics_Tools_and_Insights.pdf |
| 60 | ‘Miseq Amplikon LAuf April.xlsx’ |
| 61 | NGS.tar.gz |
| 62 | Nachweis_Bakterien_Viren_im_Hochdurchsatz.pdf |
| 63 | Nicole8_Lamprecht_logs |
| 64 | Nanopore.handouts.pdf |
| 65 | ‘Norovirus paper Susanne 191105.docx’ |
| 66 | PhyloRNAalifold.pdf |
| 67 | README_R |
| 68 | README_RNAHiSwitch_DEL |
| 69 | RNA-NGS_Analysis_modul3_NanoStringNorm.zip |
| 70 | RNAConSLOptV1.2.tar.gz |
| 71 | ‘RSV GFP5 including 3`UTR.docx’ |
| 72 | SNPs_on_pangenome.txt |
| 73 | SERVER |
| 74 | R_tutorials-master.zip |
| 75 | Rawdata_Readme.pdf |
| 76 | SUB10826945_record_preview.txt |
| 77 | S_staphylococcus_annotated_diff_expr.xls |
| 78 | Snakefile_list |
| 79 | Source_Classification_Code.rds |
| 80 | Supplementary_Table_S3.xlsx |
| 81 | Untitled.ipynb |
| 82 | UniproUGENE_UserManual.pdf |
| 83 | Untitled1.ipynb |
| 84 | Untitled2.ipynb |
| 85 | Untitled3.ipynb |
| 86 | WAC6h_vs_WAP6h_down.txt |
| 87 | damian_nodbs |
| 88 | WAC6h_vs_WAP6h_up.txt |
| 89 | ‘add. Figures Hamburg_UKE.pptx’ |
| 90 | all_gene_counts_with_annotation.xlsx |
| 91 | app_flask.py |
| 92 | bengal-bay-0.1.json |
| 93 | bengal3_ac3.yml |
| 94 | call_shell_from_Ruby.png |
| 95 | bengal3ac3.yml |
| 96 | empty.fasta |
| 97 | coefficients_csaw_vs_diffreps.xlsx |
| 98 | exchange.txt |
| 99 | exdata-data-NEI_data.zip |
| 100 | genes_wac6_wap6.xls |
| 101 | go1.13.linux-amd64.tar.gz.1 |
| 102 | hev_p2-p5.fa |
| 103 | map_corrected_backup.txt |
| 104 | install_nginx_on_hamm |
| 105 | hg19.rmsk.bed |
| 106 | metadata-9563675-processed-ok.tsv |
| 107 | mkg_sprechstundenflyer_ver1b_dezember_2019.pdf |
| 108 | multiqc_config.yaml |
| 109 | p11326_OMIKRON3398_corsurv.gb |
| 110 | p11326_OMIKRON3398_corsurv.gb_converted.fna |
| 111 | parseGenbank_reformat.py |
| 112 | pangenome-snakemake-master.zip |
| 113 | ‘phylo tree draft.pdf’ |
| 114 | qiime_params.txt |
| 115 | pool_b1_CGATGT_300.zip |
| 116 | qiime_params_backup.txt |
| 117 | qiime_params_s16_s18.txt |
| 118 | snakePipes |
| 119 | results_description.html |
| 120 | rnaalihishapes.tar.gz |
| 121 | rnaseq_length_bias.pdf |
| 122 | 3932-Leber |
| 123 | BioPython |
| 124 | Biopython |
| 125 | DEEP-DV |
| 126 | DOKTORARBEIT |
| 127 | Data_16S_Arck_vaginal_stool |
| 128 | Data_16S_BS052 |
| 129 | Data_16S_Birgit |
| 130 | Data_16S_Christner |
| 131 | Data_16S_Leonie |
| 132 | Data_16S_PatientA-G_CSF |
| 133 | Data_16S_Schaltenberg |
| 134 | Data_16S_benchmark |
| 135 | Data_16S_benchmark2 |
| 136 | Data_16S_gcdh_BKV |
| 137 | Data_Alex1_Amplicon |
| 138 | Data_Alex1_SNP |
| 139 | Data_Analysis_for_Life_Science |
| 140 | Data_Anna13_vanA-Element |
| 141 | Data_Anna14_PACBIO_methylation |
| 142 | Data_Anna_C.acnes2_old_DEL |
| 143 | Data_Anna_MT880872_update |
| 144 | Data_Anna_gap_filling_agrC |
| 145 | Data_Baechlein_Hepacivirus_2018 |
| 146 | Data_Bornavirus |
| 147 | Data_CSF |
| 148 | Data_Christine_cz19-178-rothirsch-bovines-hepacivirus |
| 149 | Data_Daniela_adenovirus_WGS |
| 150 | Data_Emilia_MeDIP_DEL |
| 151 | Data_Francesco2021_16S |
| 152 | Data_Francesco2021_16S_re |
| 153 | Data_Gunnar_MS |
| 154 | Data_Hannes_RNASeq |
| 155 | Data_Holger_Efaecium_variants_PUBLISHED |
| 156 | Data_Holger_VRE_DEL |
| 157 | Data_Icebear_Damian |
| 158 | Data_Indra3_H3K4_p2_DEL |
| 159 | Data_Indra6_RNASeq_ChipSeq_Integration_DEL |
| 160 | Data_Indra_Figures |
| 161 | Data_KatjaGiersch_new_HDV |
| 162 | Data_MHH_Encephalitits_DAMIAN |
| 163 | Data_Manja_RPAChIPSeq_public |
| 164 | Data_Manuel_WGS_Yersinia |
| 165 | Data_Manuel_WGS_Yersinia2_DEL |
| 166 | Data_Manuel_WGS_Yersinia_DEL |
| 167 | Data_Marcus_tracrRNA_structures |
| 168 | Data_Mausmaki_Damian |
| 169 | Data_Nicole1_Tropheryma_whipplei |
| 170 | Data_Nicole5 |
| 171 | Data_Nicole5_77-92 |
| 172 | Data_PaulBecher_Rotavirus |
| 173 | Data_Pietschmann_HCV_Amplicon_bigFile |
| 174 | Data_Piscine_Orthoreovirus_3_in_Brown_Trout |
| 175 | Data_Proteomics |
| 176 | Data_RNABioinformatics |
| 177 | Data_RNAKinetics |
| 178 | Data_R_courses |
| 179 | Data_SARS-CoV-2 |
| 180 | Data_SARS-CoV-2_Genome_Announcement_PUBLISHED |
| 181 | Data_Seite |
| 182 | Data_Song_aggregate_sum |
| 183 | Data_Susanne_Amplicon_RdRp_orf1_2_re |
| 184 | Data_Tabea_RNASeq |
| 185 | Data_Thaiss1_Microarray_new |
| 186 | Data_Tintelnot_16S |
| 187 | Data_Wuenee_Plots |
| 188 | Data_Yang_Poster |
| 189 | Data_jupnote |
| 190 | Data_parainfluenza |
| 191 | Data_snakemake_recipe |
| 192 | Data_temp |
| 193 | Data_viGEN |
| 194 | Genomic_Data_Science |
| 195 | Learn_UGENE |
| 196 | MMcPaper |
| 197 | Manuscript_Epigenetics_Macrophage_Yersinia |
| 198 | Manuscript_RNAHiSwitch |
| 199 | MeDIP_Emilia_copy_DEL |
| 200 | Method_biopython |
| 201 | NGS |
| 202 | Okazaki-Seq_Processing |
| 203 | RNA-NGS_Analysis_modul3_NanoStringNorm |
| 204 | RNAConSLOptV1.2 |
| 205 | RNAHeliCes |
| 206 | RNA_li_HeliCes |
| 207 | RNAliHeliCes |
| 208 | RNAliHeliCes_Relatedshapes_modified |
| 209 | R_refcard |
| 210 | R_DataCamp |
| 211 | R_cats_package |
| 212 | R_tutorials-master |
| 213 | SnakeChunks |
| 214 | align_4l_on_FJ705359 |
| 215 | align_4p_on_FJ705359 |
| 216 | assembly |
| 217 | bacto |
| 218 | bam2fastq_mapping_again |
| 219 | chipster |
| 220 | damian_GUI |
| 221 | enhancer-snakemake-demo |
| 222 | hg19_gene_annotations |
| 223 | interlab_comparison_DEL |
| 224 | my_flask |
| 225 | papers |
| 226 | pangenome-snakemake_zhaoc1 |
| 227 | pyflow-epilogos |
| 228 | raw_data_rnaseq_Indra |
| 229 | test_raw_data_dnaseq |
| 230 | test_raw_data_rnaseq |
| 231 | to_Francesco |
| 232 | ukepipe |
| 233 | ukepipe_nf |
| 234 | var_www_DjangoApp_mysite2_2023-05 |
| 235 | roentgenpass.pdf |
| 236 | salmon_tx2gene_GRCh38.tsv |
| 237 | salmon_tx2gene_chrHsv1.tsv |
| 238 | ‘sample IDs_Lamprecht.xlsx’ |
| 239 | summarySCC_PM25.rds |
| 240 | untitled.py |
| 241 | tutorial-rnaseq.pdf |
| 242 | x.log |
| 243 | webapp.tar.gz |
| 244 | temp |
| 245 | temp2 |
| 246 | Data_Susanne_Amplicon_haplotype_analyses_RdRp_orf1_2_re |
| 247 | Data_Susanne_WGS_unbiased |
~/DATA_D
| # | Name |
|---|---|
| 1 | Data_Soeren_RNA-seq_2023_PUBLISHING |
| 2 | Data_Ute |
| 3 | Data_Marc_RNA-seq_Sepidermidis |
| 4 | Data_Patricia_Transposon |
| 5 | Books_DA_for_Life |
| 6 | Data_Sven |
| 7 | Datasize_calculation_based_on_coverage.txt |
| 8 | Data_Paul_HD46_1-wt_resequencing |
| 9 | Data_Sanam_DAMIAN |
| 10 | Data_Tam_variant_calling |
| 11 | Data_Samira_Manuscripts |
| 12 | Data_Silvia_VoltRon_Debug |
| 13 | Data_Pietschmann_229ECoronavirus_Mutations_2024 |
| 14 | Data_Pietschmann_229ECoronavirus_Mutations_2025 |
| 15 | Data_Birthe_Svenja_RSV_Probe3_PUBLISHING |
~/DATA_E
| # | Name |
|---|---|
| 1 | j_huang_until_201904 |
| 2 | Data_2019_April |
| 3 | Data_2019_May |
| 4 | Data_2019_June |
| 5 | Data_2019_July |
| 6 | Data_2019_August |
| 7 | Data_2019_September |
| 8 | Data_Song_RNASeq_PUBLISHED |
| 9 | Data_Laura_MP_RNASeq |
| 10 | Data_Nicole6_HEV_Swantje2 |
| 11 | Data_Becher_Damian_Picornavirus_BovHepV |
| 12 | bacteria_refseq.zip |
| 13 | bacteria_refseq |
| 14 | Data_Rotavirus |
| 15 | Data_Xiaobo_10x |
| 16 | Data_Becher_Damian_Picornavirus_BovHepV_INCOMPLETE_DEL |
~/DATA_Intenso
| # | Name |
|---|---|
| 1 | HOME_FREIBURG_DEL |
| 2 | 150810_M03701_0019_000000000-AFJFK |
| 3 | Data_Thaiss2_Microarray |
| 4 | VirtualBox_VMs_DEL |
| 5 | ‘VirtualBox VMs_DEL’ |
| 6 | ‘VirtualBox VMs2_DEL’ |
| 7 | websites |
| 8 | DATA |
| 9 | Data_Laura |
| 10 | Data_Laura_2 |
| 11 | Data_Laura_3 |
| 12 | galaxy_tools |
| 13 | Downloads2 |
| 14 | Downloads |
| 15 | mom-baby_com_cn |
| 16 | ‘VirtualBox VMs2’ |
| 17 | VirtualBox_VMs |
| 18 | CLC_Data |
| 19 | Work_Dir2 |
| 20 | Work_Dir2_SGE |
| 21 | Data_SPANDx1_Kpneumoniae_vs_Assembly1 |
| 22 | MauveOutput |
| 23 | Fastqs |
| 24 | Data_Anna3_VRE_Ausbruch |
| 25 | Work_Dir_mock_broad_mockinput |
| 26 | Work_Dir_dM_broad_mockinput |
| 27 | Data_Anna8_RNASeq_static_shake_deprecated |
| 28 | PENDRIVE_cont |
| 29 | Work_Dir_WAP_broad_mockinput |
| 30 | Work_Dir_WAC_broad_mockinput |
| 31 | Work_Dir_dP_broad_mockinput |
| 32 | Data_Nicole10_16S_interlab |
| 33 | PAPERS |
| 34 | TB |
| 35 | Data_Anna4_SNP |
| 36 | Data_Carolin1_16S |
| 37 | ChipSeq_Raw_Data3_171009_NB501882_0024_AHNGTYBGX3 |
| 38 | m_aepfelbacher_DEL.zip |
| 39 | Data_Anna7_RNASeq_Cytoscape |
| 40 | Data_Nicole9_Hund_Katze_Mega |
| 41 | Data_Anna2_CO6114 |
| 42 | Data_Nicole3_TH17_orig |
| 43 | Data_Nicole1_Tropheryma_whipplei |
| 44 | results_K27 |
| 45 | ‘VirtualBox VMs’ |
| 46 | Data_Anna6_RNASeq |
| 47 | Data_Anna1_1585_RNAseq |
| 48 | Data_Thaiss1_Microarray |
| 49 | Data_Nicole7_Anelloviruses_Polyomavirus |
| 50 | Data_Nina1_Nicole5_1-76 |
| 51 | Data_Nina1_merged |
| 52 | Data_Nicole8_Lamprecht |
| 53 | Data_Anna5_SNP |
| 54 | chipseq |
| 55 | Downloads_DEL |
| 56 | Data_Gagliani2_enriched_16S |
| 57 | Data_Gagliani1_18S_16S |
| 58 | m_aepfelbacher |
| 59 | Data_Susanne_WGS_3amplicons |
/media/jhuang/Titisee
| # | Name |
|---|---|
| 1 | Data_Anna4_SNP |
| 2 | Data_Anna5_SNP_rsync_error |
| 3 | TRASH |
| 4 | Data_Nicole6_HEV_4_SNP_calling_PE_DEL |
| 5 | Data_Nina1_Nicole7 |
| 6 | Data_Nicole6_HEV_4_SNP_calling_SE_DEL |
| 7 | 180119_M03701_0115_000000000-BFG46.zip |
| 8 | Data_Nicole10_16S_interlab_PUBLISHED |
| 9 | Anna11_assemblies |
| 10 | Anna11_trees |
| 11 | Data_Nicole6_HEV_new_orig_fastqs |
| 12 | Data_Anna9_OXA-48_or_OXA-181 |
| 13 | bengal_results_v1_2018 |
| 14 | DO.pdf |
| 15 | damian_DEL |
| 16 | MAGpy_db |
| 17 | UGENE_v1_32_data_cistrome |
| 18 | UGENE_v1_32_data_ngs_classification |
| 19 | Data_Nicole6_HEV_Swantje |
| 20 | Data_Nico_Gagliani |
| 21 | GAMOLA2_prototyp |
| 22 | Thomas_methylation_EPIC_DO |
| 23 | Data_Nicola_Schaltenberg |
| 24 | Data_Nicola_Schaltenberg_PICRUSt |
| 25 | HOME_FREIBURG |
| 26 | Data_Francesco_16S |
| 27 | 3rd_party |
| 28 | ConsPred_prokaryotic_genome_annotation |
| 29 | ‘System Volume Information’ |
| 30 | damian_v201016 |
| 31 | Data_Holger_VRE |
| 32 | Data_Holger_Pseudomonas_aeruginosa_SNP |
| 33 | Eigene_Ordner_HR |
| 34 | GAMOLA2 |
| 35 | Data_Anastasia_RNASeq |
| 36 | Data_Amir_PUBLISHED |
| 37 | ‘$RECYCLE.BIN’ |
| 38 | Data_Xiaobo_10x_3 |
| 39 | Data_Tam_DNAseq_2023_Comparative_ATCC19606_AYE_ATCC17978 |
| 40 | Data_Holger_S.epidermidis_short |
| 41 | TEMP |
| 42 | Data_Holger_S.epidermidis_long |
/media/jhuang/Elements(Denise_ChIPseq)
| # | Name |
|---|---|
| 1 | Data_Denise_LTtrunc_H3K27me3_2_results_DEL |
| 2 | Data_Denise_LTtrunc_H3K4me3_2_results_DEL |
| 3 | Data_Anna12_HAPDICS_final_not_finished_DEL |
| 4 | m_aepfelbacher_DEL |
| 5 | Data_Damian |
| 6 | ST772_DEL |
| 7 | ALL_trimmed_part_DEL |
| 8 | Data_Denise_ChIPSeq_Protocol1 |
| 9 | Data_Pietschmann_HCV_Amplicon |
| 10 | Data_Nicole6_HEV_ownMethod_new |
| 11 | HD04-1.fasta |
| 12 | RNAHiSwitch_ |
| 13 | RNAHiSwitch__ |
| 14 | RNAHiSwitch___ |
| 15 | RNAHiSwitchpaper |
| 16 | RNAHiSwitch_milestone1_DELETED |
| 17 | RNAHiSwitch_paper.tar.gz |
| 18 | RNAHiSwitch_paper_DELETED |
| 19 | RNAHiSwitch_milestone1 |
| 20 | RNAHiSwitch_paper |
| 21 | Ute_RNASeq_results |
| 22 | Ute_miRNA_results_38 |
| 23 | RNAHiSwitch |
| 24 | Data_HepE_Freiburg_PUBLISHED |
| 25 | Data_INTENSO_2022-06 |
| 26 | ‘$RECYCLE.BIN’ |
| 27 | ‘System Volume Information’ |
| 28 | Data_Anna_Mixta_hanseatica_PUBLISHED |
| 29 | coi_disclosure.docx |
| 30 | Data_Jingang |
| 31 | **Data_Susanne_16S_re_UNPUBLISHED *** |
| 32 | Data_Denise_ChIPSeq_Protocol2 |
| 33 | Data_Caroline_RNAseq_wt_timecourse |
| 34 | Data_Caroline_RNAseq_brain_organoids |
| 35 | Data_Amir_PUBLISHED_DEL |
| 36 | Data_download_virus_fam |
| 37 | Data_Gunnar_Yersiniomics_COPYFAILED_DEL |
| 38 | Data_Paul_and_Marc_Epidome_batch3 |
| 39 | ifconfig_hamm.txt |
| 40 | Data_Soeren_2023_PUBLISHING |
| 41 | Data_Birthe_Svenja_RSV_Probe3_PUBLISHING |
| 42 | Data_Ute |
| 43 | **Data_Susanne_16S_UNPUBLISHED *** |
/media/jhuang/Seagate Expansion Drive(HOffice)
| # | Name |
|---|---|
| 1 | SeagateExpansion.ico |
| 2 | Autorun.inf |
| 3 | Start_Here_Win.exe |
| 4 | Warranty.pdf |
| 5 | Start_Here_Mac.app |
| 6 | Seagate |
| 7 | HomeOffice_DIR (Data_Anna_HAPDICS_RNASeq, From_Samsung_T5) |
| 8 | DATA_COPY_FROM_178528 (copy_and_clean.sh, logfile_jhuang.log, jhuang) |
| 9 | ‘System Volume Information’ |
| 10 | ‘$RECYCLE.BIN’ |
/media/jhuang/Elements(Anna_C.arnes)
| # | Name |
|---|---|
| 1 | Data_Swantje_HEV_using_viral-ngs |
| 2 | VIPER_static_DEL |
| 3 | Data_Nicole6_HEV_Swantje1_blood |
| 4 | Data_Nicole6_HEV_benchmark |
| 5 | Data_Denise_RNASeq_GSE79958 |
| 6 | Data_16S_Leonie_from_Nico_Gaglianis |
| 7 | Fastqs_19-21 |
| 8 | ‘System Volume Information’ |
| 9 | Data_Luise_Epidome_test |
| 10 | Data_Anna_C.acnes_PUBLISHED |
| 11 | Data_Denise_LT_DNA_Bindung |
| 12 | Data_Denise_LT_K331A_RNASeq |
| 13 | Data_Luise_Epidome_batch1 |
| 14 | Data_Luise_Pseudomonas_aeruginosa_PUBLISHED |
| 15 | Data_Luise_Epidome_batch2 |
| 16 | picrust2_out_2024_2 |
| 17 | ‘$RECYCLE.BIN’ |
/media/jhuang/Seagate Expansion Drive(DATA_COPY_FROM_hamburg)
| # | Name |
|---|---|
| 1 | Autorun.inf |
| 2 | Start_Here_Win.exe |
| 3 | Warranty.pdf |
| 4 | Start_Here_Mac.app |
| 5 | Seagate |
| 6 | DATA_COPY_TRANSFER_INCOMPLETE_DEL |
| 7 | DATA_COPY_FROM_hamburg |
/media/jhuang/Seagate Expansion Drive(Seagate_1)
| # | Name |
|---|---|
| 1 | RNA_seq_analysis_tools_2013 |
| 2 | Data_Laura0 |
| 3 | Data_Petra_Arck |
| 4 | Data_Martin_mycoplasma |
| 5 | chromhmm-enhancers |
| 6 | ChromHMM_Dir |
| 7 | Data_Denise_sT_H3K4me3 |
| 8 | Data_Denise_sT_H3K27me3 |
| 9 | Start_Here_Mac.app |
| 10 | Seagate |
| 11 | Data_Nicole16_parapoxvirus |
| 12 | Project_h_rohde_Susanne_WGS_unbiased_DEL.zip |
| 13 | Data_Denise_ChIPSeq_Protocol1 |
| 14 | Data_ENNGS_pathogen_detection_pipeline_comparison |
| 15 | j_huang_201904_202002 |
| 16 | Data_Laura_ChIPseq_GSE120945 |
| 17 | batch_200314_incomplete |
| 18 | m_aepfelbacher.zip |
| 19 | m_error_DEL |
| 20 | batch_200325 |
| 21 | batch_200319 |
| 22 | GAMOLA2_prototyp |
| 23 | Data_Nicola_Gagliani |
| 24 | 2017-18_raw_data |
| 25 | Data_Arck_MeDIP |
| 26 | trimmed |
| 27 | Data_Nicole_16S_Christmas_2020_2 |
| 28 | j_huang_202007_202012 |
| 29 | Data_Nicole_16S_Christmas_2020 |
| 30 | Downloads_2021-01-18_DEL |
| 31 | Data_Laura_plasmid |
| 32 | Data_Laura_16S_2_re |
| 33 | Data_Laura_16S_2 |
| 34 | Data_Laura_16S_2re |
| 35 | Data_Laura_16S_merged |
| 36 | Downloads_DEL |
| 37 | Data_Laura_16S |
| 38 | Data_Anna12_HAPDICS_final |
| 39 | ‘$RECYCLE.BIN’ |
| 40 | ‘System Volume Information’ |
/media/jhuang/Seagate Expansion Drive(Seagate_2)
| # | Name |
|---|---|
| 1 | Data_Nicole4_TH17 |
| 2 | Start_Here_Win.exe |
| 3 | Autorun.inf |
| 4 | Warranty.pdf |
| 5 | Start_Here_Mac.app |
| 6 | Seagate |
| 7 | Data_Denise_RNASeq_trimmed_DEL |
| 8 | HD12 |
| 9 | Qi_panGenome |
| 10 | ALL |
| 11 | fastq_HPI_bw_2019_08_and_2020_02 |
| 12 | f1_R1_link.sh |
| 13 | f1_R2_link.sh |
| 14 | rtpd_files |
| 15 | m_aepfelbacher.zip |
| 16 | Data_Nicole_16S_Hamburg_Odense_Cornell_Muenster |
| 17 | HyAsP_incomplete_genomes |
| 18 | HyAsP_normal_sampled_input |
| 19 | HyAsP_complete_genomes |
| 20 | video.zip |
| 21 | sam2bedgff.pl |
| 22 | HD04.infection.hS_vs_HD04.nose.hS_annotated_degenes.xls |
| 23 | ALL83 |
| 24 | Data_Pietschmann_RSV_Probe_PUBLISHED |
| 25 | HyAsP_normal |
| 26 | Data_Manthey_16S |
| 27 | rtpd_files_DEL |
| 28 | HyAsP_bold |
| 29 | Data_HEV |
| 30 | Seq_VRE_hybridassembly |
| 31 | Data_Anna12_HAPDICS_raw_data_shovill_prokka |
| 32 | Data_Anna_HAPDICS_WGS_ALL |
| 33 | Data_HEV_Freiburg_2020 |
| 34 | Data_Nicole_HDV_Recombination_PUBLISHED |
| 35 | s_hero2x |
| 36 | 201030_M03701_0207_000000000-J57B4.zip |
| 37 | README |
| 38 | ‘README(1)’ |
| 39 | dna2.fasta.fai |
| 40 | 91.pep |
| 41 | 91.orf |
| 42 | 91.orf.fai |
| 43 | dgaston-dec-06-2012-121211124858-phpapp01.pdf |
| 44 | tileshop.fcgi |
| 45 | ppat.1009304.s016.tif |
| 46 | sequence.txt |
| 47 | ‘sequence(1).txt’ |
| 48 | GSE128169_series_matrix.txt.gz |
| 49 | GSE128169_family.soft.gz |
| 50 | Data_Anna_HAPDICS_RNASeq |
| 51 | Data_Christopher_MeDIP_MMc_PUBLISHED |
| 52 | Data_Gunnar_Yersiniomics_IMCOMPLETE_DEL |
| 53 | Data_Denise_RNASeq |
| 54 | ‘System Volume Information’ |
| 55 | ‘$RECYCLE.BIN’ |
/media/jhuang/Elements(An14_RNAs)
| # | Name |
|---|---|
| 1 | Data_Anna10_RP62A |
| 2 | Data_Nicole12_16S_Kluwe_Bunders |
| 3 | chromhmm-enhancers |
| 4 | Data_Denise_sT_Methylation |
| 5 | Data_Denise_LTtrunc_Methylation |
| 6 | Data_16S_arckNov |
| 7 | Data_Tabea_RNASeq |
| 8 | nr_gz_README |
| 9 | j_huang_raw_fq |
| 10 | ‘System Volume Information’ |
| 11 | ‘$RECYCLE.BIN’ |
| 12 | host_refs |
| 13 | Vraw |
| 14 | **Data_Susanne_Amplicon_RdRp_orf1_2 *** |
| 15 | tmp |
| 16 | Data_RNA188_Paul_Becher |
| 17 | Data_ChIPSeq_Laura |
| 18 | Data_16S_arckNov_review_PUBLISHED |
| 19 | Data_16S_arckNov_re |
| 20 | Fastqs |
| 21 | Data_Tabea_RNASeq_submission |
| 22 | Data_Anna_Cutibacterium_acnes_DEL |
| 23 | Data_Silvia_RNASeq_SUBMISSION |
| 24 | Data_Hannes_ChIPSeq |
| 25 | Data_Anna14_RNASeq_to_be_DEL |
| 26 | Data_Pietschmann_RSV_Probe2_PUBLISHED |
| 27 | Data_Holger_Klebsiella_pneumoniae_SNP_PUBLISHING |
| 28 | Data_Anna14_RNASeq_plus_public |
/media/jhuang/Elements(Indra_HAPDICS)
| # | Name |
|---|---|
| 1 | Data_Anna11_Sepdermidis_DEL |
| 2 | HD15_without_10 |
| 3 | HD31 |
| 4 | HD33 |
| 5 | HD39 |
| 6 | HD43 |
| 7 | HD46 |
| 8 | HD15_with_10 |
| 9 | HD26 |
| 10 | HD59 |
| 11 | HD25 |
| 12 | HD21 |
| 13 | HD17 |
| 14 | HD04 |
| 15 | Data_Anna11_Pair1-6_P6 |
| 16 | Data_Anna12_HAPDICS_HyAsP |
| 17 | HAPDICS_hyasp_plasmids |
| 18 | Data_Anna_HAPDICS_review |
| 19 | data_overview.txt |
| 20 | align_assem_res_DEL |
| 21 | ‘System Volume Information’ |
| 22 | EXCHANGE_DEL |
| 23 | Data_Indra_H3K4me3_public |
| 24 | Data_Gunnar_MS |
| 25 | ‘$RECYCLE.BIN’ |
| 26 | UKE_DELLWorkstation_C_Users_indbe_Desktop |
| 27 | Linux_DELLWorkstation_C_Users_indbe_VirtualBoxVMs |
| 28 | Data_Anna_HAPDICS_RNASeq_rawdata |
| 29 | Data_Indra_H3K27ac_public |
| 30 | Data_Holger_Klebsiella_pneumoniae_SNP_PUBLISHING |
| 31 | DATA_INDRA_RNASEQ |
| 32 | DATA_INDRA_CHIPSEQ |
/media/jhuang/Elements(jhuang_*)
| # | Name |
|---|---|
| 1 | ‘Install Western Digital Software for Windows.exe’ |
| 2 | ‘Install Western Digital Software for Mac.dmg’ |
| 3 | ‘System Volume Information’ |
| 4 | ‘$RECYCLE.BIN’ |
| 5 | 20250203_FS10003086_95_BTR67811-0621 |
/media/jhuang/Smarty
| # | Name |
|---|---|
| 1 | lost+found |
| 2 | Blast_db |
| 3 | temporary_files_DEL |
| 4 | ALIGN_ASSEM |
| 5 | Data_Paul_Staphylococcus_epidermidis |
| 6 | Data_16S_Degenhardt_Marius_DEL |
| 7 | Data_Gunnar_Yersiniomics_DEL |
| 8 | Data_Manja_RNAseq_Organoids_Virus |
| 9 | Data_Emilia_MeDIP |
| 10 | DjangoApp_Backup_2023-10-30 |
| 11 | ref |
| 12 | Data_Michelle_RNAseq_2025_raw_data_DEL_AFTER_UPLOAD_GEO |
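The numbered tables above could be regenerated from the disks rather than maintained by hand. The following is a minimal sketch, assuming the intended order is oldest first by modification time (as `ls -tlrh` prints) and that hidden entries are skipped. Both helper names are hypothetical.

```python
import os
import tempfile


def list_dir_oldest_first(path):
    """Return non-hidden entry names sorted by modification time, oldest first."""
    entries = sorted(
        os.scandir(path),
        # Use the link's own timestamp for symlinks, as `ls -l` does.
        key=lambda e: e.stat(follow_symlinks=False).st_mtime,
    )
    return [e.name for e in entries if not e.name.startswith(".")]


def names_to_table(names):
    """Render a list of names as a '| # | Name |' markdown table."""
    rows = ["| # | Name |", "|---|---|"]
    if not names:
        rows.append("| 1 | (empty) |")
    for i, name in enumerate(names, start=1):
        rows.append(f"| {i} | {name} |")
    return "\n".join(rows)


# Demonstration on a throwaway directory with two files of known age.
with tempfile.TemporaryDirectory() as tmp:
    for name, mtime in (("newer.txt", 200), ("older.txt", 100)):
        file_path = os.path.join(tmp, name)
        open(file_path, "w").close()
        os.utime(file_path, (mtime, mtime))
    print(names_to_table(list_dir_oldest_first(tmp)))
```

Calling `names_to_table(list_dir_oldest_first(os.path.expanduser("~/DATA")))` would produce a table in the same layout as the ones above, provided the path is mounted and readable.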
Original input (as one point)
/media/jhuang/INTENSO is empty --> Now the data are on ~/DATA_Intenso
/dev/sdg1 3,7T 512K 3,7T 1% /media/jhuang/INTENSO
jhuang@WS-2290C:~/DATA$ ls -tlrh
total 1,6M
drwxrwxrwx 6 jhuang jhuang 4,0K Okt 26 2022 Data_Ute_MKL1
drwxrwxrwx 8 jhuang jhuang 4,0K Jan 13 2023 Data_Ute_RNA_4_2022-11_test
drwxrwxr-x 7 jhuang jhuang 4,0K Mär 8 2023 Data_Ute_RNA_3
drwxr-xr-x 11 jhuang jhuang 4,0K Dez 19 2023 Data_Susanne_Carotis_RNASeq_PUBLISHING
drwxr-xr-x 21 jhuang jhuang 4,0K Jun 18 2024 Data_Jiline_Yersinia_SNP
drwxrwxr-x 5 jhuang jhuang 4,0K Jul 22 2024 Data_Tam_ABAYE_RS05070_on_A_calcoaceticus_baumannii_complex_DUPLICATED_DEL
drwxr-xr-x 2 jhuang jhuang 4,0K Jul 23 2024 Data_Nicole_CRC1648
drwxr-xr-x 4 jhuang jhuang 4,0K Sep 6 2024 Mouse_HS3ST1_12373_out
drwxr-xr-x 4 jhuang jhuang 4,0K Sep 6 2024 Mouse_HS3ST1_12175_out
drwxrwxr-x 10 jhuang jhuang 4,0K Sep 12 2024 Data_Biobakery
drwxrwxr-x 6 jhuang jhuang 4,0K Sep 23 2024 Data_Xiaobo_10x_2
drwxr-xr-x 4 jhuang jhuang 4,0K Sep 23 2024 Data_Xiaobo_10x_3
drwxr-xr-x 3 jhuang jhuang 4,0K Sep 26 2024 Talk_Nicole_CRC1648
drwxr-xr-x 2 jhuang jhuang 4,0K Sep 26 2024 Talks_Bioinformatics_Meeting
drwxr-xr-x 2 jhuang jhuang 12K Sep 26 2024 Talks_resources
drwxrwxr-x 6 jhuang jhuang 12K Okt 10 2024 Data_Susanne_MPox_DAMIAN
drwxrwxr-x 3 jhuang jhuang 4,0K Okt 14 2024 Data_host_transcriptional_response
drwxr-xr-x 13 jhuang jhuang 4,0K Okt 23 2024 Talks_including_DEEP-DV
drwxrwxr-x 2 jhuang jhuang 4,0K Okt 24 2024 DOKTORARBEIT
drwxrwxr-x 18 jhuang jhuang 4,0K Nov 11 2024 Data_Susanne_MPox
drwxrwxr-x 25 jhuang jhuang 12K Nov 11 2024 Data_Jiline_Transposon
drwxrwxr-x 16 jhuang jhuang 20K Nov 25 2024 Data_Jiline_Transposon2
drwxrwxr-x 3 jhuang jhuang 4,0K Dez 13 2024 Data_Matlab
drwxrwxr-x 5 jhuang jhuang 4,0K Jan 28 2025 deepseek-ai
drwx------ 4 jhuang jhuang 4,0K Feb 5 2025 Stick_Mi_DEL
-rw-rw-r-- 1 jhuang jhuang 1,1K Feb 18 2025 TODO_shares
drwxrwxrwx 13 jhuang jhuang 4,0K Mär 3 2025 Data_Ute_RNA_4
drwxrwxr-x 2 jhuang jhuang 4,0K Mär 31 2025 Data_Liu_PCA_plot
-rw-rw-r-- 1 jhuang jhuang 43K Apr 3 2025 README_run_viral-ngs_inside_Docker
-rw-rw-r-- 1 jhuang jhuang 8,7K Apr 9 2025 README_compare_genomes
-rw-rw-r-- 1 jhuang jhuang 0 Apr 11 2025 mapped.bam
drwxrwxr-x 3 jhuang jhuang 4,0K Apr 24 2025 Data_Serpapi
drwxrwxrwx 22 jhuang jhuang 4,0K Apr 30 2025 Data_Ute_RNA_1_2
drwxrwxr-x 15 jhuang jhuang 4,0K Apr 30 2025 Data_Marc_RNAseq_2024
drwxrwxr-x 45 jhuang jhuang 12K Mai 15 2025 Data_Nicole_CaptureProbeSequencing
-rw-rw-r-- 1 jhuang jhuang 657 Mai 23 2025 LOG_mapping
drwxrwxr-x 46 jhuang jhuang 4,0K Mai 26 2025 Data_Huang_Human_herpesvirus_3
drwxrwxr-x 8 jhuang jhuang 4,0K Jun 13 2025 Data_Nicole_DAMIAN_Post-processing_Pathoprobe_FluB_Links
lrwxrwxrwx 1 jhuang jhuang 37 Jun 16 2025 Access_to_Win7 -> ./Data_Marius_16S/picrust2_out_2024_2
drwxrwxr-x 17 jhuang jhuang 4,0K Jun 18 2025 Data_DAMIAN_Post-processing_Flavivirus_and_FSME_and_Haemophilus
drwxr-xr-x 42 jhuang jhuang 36K Jun 23 2025 Data_Luise_Sepi_STKN
drwxrwxr-x 29 jhuang jhuang 20K Jul 22 2025 Data_Patricia_Sepi_7samples
drwxr-xr-x 9 jhuang jhuang 4,0K Aug 8 2025 Data_Soeren_2025_PUBLISHING
drwxrwxr-x 9 jhuang jhuang 4,0K Aug 13 2025 Data_Ben_RNAseq_2025
drwxrwxr-x 34 jhuang jhuang 12K Sep 3 12:18 Data_Tam_DNAseq_2025_AYE-WT_Q_S_craA-Tig4_craA-1-Cm200_craA-2-Cm200
drwxrwxr-x 50 jhuang jhuang 16K Okt 6 17:59 Data_Patricia_Transposon
drwxrwxr-x 23 jhuang jhuang 12K Okt 20 13:27 Data_Patricia_Transposon_2025
drwxrwxr-x 2 jhuang jhuang 4,0K Okt 23 12:21 Colocation_Space
drwxrwxr-x 2 jhuang jhuang 4,0K Okt 27 12:56 Data_Tam_Methylation_2025_empty
-rw-rw-r-- 1 jhuang jhuang 151K Nov 3 13:01 2025-11-03_eVB-Schreiben_12-57.pdf
-rw-rw-r-- 1 jhuang jhuang 67K Nov 5 16:59 DEGs_Group1_A1-A3+A8-A10_vs_Group2_B10-B16.png
-rw-rw-r-- 1 jhuang jhuang 687K Nov 14 09:55 README.pdf
drwxrwxr-x 2 jhuang jhuang 4,0K Nov 24 15:43 Data_Hannes_JCM00612
drwxrwxr-x 3 jhuang jhuang 4,0K Dez 4 17:03 167_redundant_DEL
drwxrwxr-x 2 jhuang jhuang 4,0K Dez 8 10:33 Lehre_Bioinformatik
drwxrwxr-x 27 jhuang jhuang 12K Dez 8 11:29 Data_Ben_Boruta_Analysis
drwxrwxr-x 18 jhuang jhuang 4,0K Dez 8 17:39 Data_Childrensclinic_16S_2025_DEL
drwxrwxr-x 2 jhuang jhuang 4,0K Dez 10 10:05 Data_Ben_Mycobacterium_pseudoscrofulaceum
-rw-rw-r-- 1 jhuang jhuang 8,9K Dez 15 12:42 Foong_RNA_mSystems_Huang_Changed.txt
drwxrwxr-x 22 jhuang jhuang 4,0K Dez 17 13:07 Data_Pietro_Scatturo_and_Charlotte_Uetrecht_16S_2025
drwxrwxr-x 8 jhuang jhuang 4,0K Dez 18 10:45 Data_JuliaBerger_RNASeq_SARS-CoV-2
drwxrwxr-x 19 jhuang jhuang 4,0K Jan 3 17:42 Data_PaulBongarts_S.epidermidis_HDRNA
lrwxrwxrwx 1 jhuang jhuang 31 Jan 12 14:30 Data_Ute -> /media/jhuang/Elements/Data_Ute
drwxrwxr-x 12 jhuang jhuang 4,0K Jan 16 12:44 Data_Foong_DNAseq_2025_AYE_Dark_vs_Light_TODO
drwxrwxrwx 22 jhuang jhuang 4,0K Jan 16 12:48 Data_Foong_RNAseq_2021_ATCC19606_Cm
drwxrwxr-x 2 jhuang jhuang 4,0K Jan 16 13:02 Data_Tam_Funding
drwxrwxr-x 9 jhuang jhuang 4,0K Jan 16 13:32 Data_Tam_RNAseq_2025_LB-AB_IJ_W1_Y1_WT_vs_Mac-AB_IJ_W1_Y1_WT_on_ATCC19606
drwxrwxr-x 12 jhuang jhuang 4,0K Jan 16 13:32 Data_Tam_RNAseq_2025_subMIC_exposure_on_ATCC19606
-rw-rw-r-- 1 jhuang jhuang 1,2K Jan 16 13:34 Data_Tam.txt
drwxrwxr-x 16 jhuang jhuang 4,0K Jan 16 13:37 Data_Tam_RNAseq_2024_AUM_MHB_Urine_on_ATCC19606
drwxrwxr-x 10 jhuang jhuang 4,0K Jan 16 18:22 Data_Tam_Metagenomics_2026
drwxrwxr-x 6 jhuang jhuang 16K Jan 23 16:35 Data_Michelle
drwxrwxr-x 38 jhuang jhuang 12K Jan 28 15:20 Data_Nicole_16S_2025_Childrensclinic
drwxr-xr-x 145 jhuang jhuang 36K Jan 29 10:49 Data_Sophie_HDV_Sequences
drwxrwxr-x 4 jhuang jhuang 4,0K Jan 30 11:44 Data_Tam_DNAseq_2026_19606deltaIJfluE
-rw-rw-r-- 1 jhuang jhuang 63K Jan 30 17:53 README_nf-core
drwxrwxr-x 22 jhuang jhuang 4,0K Feb 4 10:43 Data_Vero_Kymographs
drwxrwxr-x 13 jhuang jhuang 4,0K Feb 4 14:06 Access_to_Win10
drwxrwxr-x 7 jhuang jhuang 4,0K Feb 5 11:59 Data_Patricia_AMRFinderPlus_2025
drwxrwxr-x 45 jhuang jhuang 4,0K Feb 6 11:54 Data_Tam_DNAseq_2025_Unknown-adeABadeIJ_adeIJK_CM1_CM2
drwxrwxr-x 41 jhuang jhuang 12K Feb 9 15:11 Data_Damian
drwxrwxr-x 6 jhuang jhuang 4,0K Feb 13 12:48 Data_Karoline_16S
drwxrwxr-x 13 jhuang jhuang 12K Feb 13 18:09 Data_JuliaFuchs_RNAseq_2025
drwxrwxr-x 18 jhuang jhuang 4,0K Feb 16 11:19 Data_Tam_DNAseq_2025_ATCC19606-Y1Y2Y3Y4W1W2W3W4_TODO
drwxrwxr-x 34 jhuang jhuang 4,0K Feb 16 15:54 Data_Tam_DNAseq_2026_Acinetobacter_harbinensis
drwxrwxr-x 19 jhuang jhuang 4,0K Feb 16 17:13 Data_Benjamin_DNAseq_2026_GE11174
drwxrwxrwx 36 jhuang jhuang 12K Feb 17 15:02 Data_Susanne_spatialRNA_2022.9.1_backup
drwxrwxr-x 39 jhuang jhuang 12K Feb 17 15:12 Data_Susanne_spatialRNA
jhuang@WS-2290C:~/DATA_A$ ls -ltrh
total 24K
drwxr-xr-x 7 jhuang jhuang 4,0K Jun 18 2024 Data_Damian_NEW_CREATED
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_R_bubbleplots
drwxr-xr-x 16 jhuang jhuang 4,0K Jun 18 2024 Data_Ute_TRANSFERED_DEL
drwxr-xr-x 2 jhuang jhuang 4,0K Okt 7 2024 Paper_Target_capture_sequencing_MHH_PUBLISHED
drwxr-xr-x 20 jhuang jhuang 4,0K Okt 8 2024 Data_Nicole8_Lamprecht_new_PUBLISHED
drwxrwxrwx 8 jhuang jhuang 4,0K Mai 21 2025 Data_Samira_RNAseq
jhuang@WS-2290C:~/DATA_B$ ls -tlrh
total 136K
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_DAMIAN_endocarditis_encephalitis
drwxr-xr-x 8 jhuang jhuang 4,0K Jun 18 2024 Data_Denise_sT_PUBLISHING
drwxr-xr-x 12 jhuang jhuang 4,0K Jun 18 2024 Data_Fran2_16S_func
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Holger_5179-R1_vs_5179
drwxr-xr-x 16 jhuang jhuang 4,0K Jun 18 2024 Antraege_
drwxr-xr-x 18 jhuang jhuang 4,0K Jun 18 2024 Data_16S_Nicole_210222
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 Data_Adam_Influenza_A_virus
drwxr-xr-x 14 jhuang jhuang 12K Jun 18 2024 Data_Anna_Efaecium_assembly
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Bactopia
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Data_Ben_RNAseq
drwxr-xr-x 7 jhuang jhuang 4,0K Jun 18 2024 Data_Johannes_PIV3
drwxr-xr-x 19 jhuang jhuang 4,0K Jun 18 2024 Data_Luise_Epidome_longitudinal_nose
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 Data_Manja_Hannes_Probedesign
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Marc_AD_PUBLISHING
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Marc_RNA-seq_Saureus_Review
drwxr-xr-x 17 jhuang jhuang 4,0K Jun 18 2024 Data_Nicole_16S
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_Nicole_cfDNA_pathogens
drwxr-xr-x 16 jhuang jhuang 4,0K Jun 18 2024 Data_Ring_and_CSF_PegivirusC_DAMIAN
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 Data_Song_Microarray
drwxr-xr-x 11 jhuang jhuang 4,0K Jun 18 2024 Data_Susanne_Omnikron
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_Viro
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Doktorarbeit
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Poster_Rohde_20230724
drwxr-xr-x 6 jhuang jhuang 4,0K Jul 12 2024 Data_Django
drwxr-xr-x 35 jhuang jhuang 4,0K Okt 21 2024 Data_Holger_S.epidermidis_1585_5179_HD05
drwxr-xr-x 9 jhuang jhuang 4,0K Nov 18 2024 Data_Manja_RNAseq_Organoids_Virus
drwxr-xr-x 2 jhuang jhuang 4,0K Feb 21 2025 Data_Holger_MT880870_MT880872_Annotation
drwxr-xr-x 12 jhuang jhuang 4,0K Apr 8 2025 Data_Soeren_RNA-seq_2022
drwxr-xr-x 5 jhuang jhuang 4,0K Apr 11 2025 Data_Manja_RNAseq_Organoids_Merged
drwxr-xr-x 24 jhuang jhuang 4,0K Apr 25 2025 Data_Gunnar_Yersiniomics
drwxr-xr-x 10 jhuang jhuang 4,0K Jan 16 17:14 Data_Manja_RNAseq_Organoids
drwxr-xr-x 3 jhuang jhuang 4,0K Feb 17 12:11 Data_Susanne_Carotis_MS
jhuang@WS-2290C:~/DATA_C$ ls -tlrh
total 13G
-rwxr-xr-x 1 jhuang jhuang 1,7M Jun 18 2024 2022-10-27_IRI_manuscript_v03_JH.docx
-rwxr-xr-x 1 jhuang jhuang 7,1K Jun 18 2024 16304905.fasta
-rwxr-xr-x 1 jhuang jhuang 55K Jun 18 2024 '16S data manuscript_NF.docx'
-rwxr-xr-x 1 jhuang jhuang 792K Jun 18 2024 180820_2_supp_4265595_sw6zjk.docx
-rwxr-xr-x 1 jhuang jhuang 17K Jun 18 2024 180820_2_supp_4265596_sw6zjk.docx
-rwxr-xr-x 1 jhuang jhuang 12K Jun 18 2024 1a_vs_3.csv
-rwxr-xr-x 1 jhuang jhuang 90K Jun 18 2024 '2.05.01.05-A01 Urlaubsantrag-Shuting-beantragt.pdf'
-rwxr-xr-x 1 jhuang jhuang 708K Jun 18 2024 2014SawickaBBA.pdf
-rwxr-xr-x 1 jhuang jhuang 61K Jun 18 2024 20160509Manuscript_NDM_OXA_mitKomm.doc
-rwxr-xr-x 1 jhuang jhuang 289K Jun 18 2024 220607_Agenda_monthly_meeting.pdf
-rwxr-xr-x 1 jhuang jhuang 14K Jun 18 2024 '20221129 Table mutations.docx'
-rwxr-xr-x 1 jhuang jhuang 12G Jun 18 2024 230602_NB501882_0428_AHKG53BGXT.zip
-rwxr-xr-x 1 jhuang jhuang 107K Jun 18 2024 362383173.rar
-rwxr-xr-x 1 jhuang jhuang 128K Jun 18 2024 562.9459.1.fa
-rwxr-xr-x 1 jhuang jhuang 126K Jun 18 2024 562.9459.1_rc.fa
-rwxr-xr-x 1 jhuang jhuang 1,6M Jun 18 2024 ASA3P.pdf
-rwxr-xr-x 1 jhuang jhuang 21K Jun 18 2024 All_indels_annotated_vHR.xlsx
-rwxr-xr-x 1 jhuang jhuang 11K Jun 18 2024 'Amplikon_indeces_Susanne +groups.xlsx'
-rwxr-xr-x 1 jhuang jhuang 9,6K Jun 18 2024 Amplikon_indeces_Susanne.xlsx
-rwxr-xr-x 1 jhuang jhuang 68 Jun 18 2024 GAMOLA2
-rwxr-xr-x 1 jhuang jhuang 88 Jun 18 2024 Data_Susanne_Carotis_spatialRNA_PUBLISHING
-rwxr-xr-x 1 jhuang jhuang 112 Jun 18 2024 Data_Paul_Staphylococcus_epidermidis
-rwxr-xr-x 1 jhuang jhuang 118 Jun 18 2024 Data_Nicola_Schaltenberg_PICRUSt
-rwxr-xr-x 1 jhuang jhuang 100 Jun 18 2024 Data_Nicola_Schaltenberg
-rwxr-xr-x 1 jhuang jhuang 94 Jun 18 2024 Data_Nicola_Gagliani
-rwxr-xr-x 1 jhuang jhuang 96 Jun 18 2024 Data_methylome_MMc
-rwxr-xr-x 1 jhuang jhuang 78 Jun 18 2024 Data_Jingang
-rwxr-xr-x 1 jhuang jhuang 112 Jun 18 2024 Data_Indra_RNASeq_GSM2262901
-rwxr-xr-x 1 jhuang jhuang 84 Jun 18 2024 Data_Holger_VRE
-rwxr-xr-x 1 jhuang jhuang 128 Jun 18 2024 Data_Holger_Pseudomonas_aeruginosa_SNP
-rwxr-xr-x 1 jhuang jhuang 92 Jun 18 2024 Data_Hannes_ChIPSeq
-rwxr-xr-x 1 jhuang jhuang 76 Jun 18 2024 Data_Emilia_MeDIP
-rwxr-xr-x 1 jhuang jhuang 88 Jun 18 2024 Data_ChristophFR_HepE_published
-rwxr-xr-x 1 jhuang jhuang 158 Jun 18 2024 Data_Christopher_MeDIP_MMc_published
-rwxr-xr-x 1 jhuang jhuang 104 Jun 18 2024 Data_Anna_Kieler_Sepi_Staemme
-rwxr-xr-x 1 jhuang jhuang 136 Jun 18 2024 Data_Anna12_HAPDICS_final
-rwxr-xr-x 1 jhuang jhuang 96 Jun 18 2024 Data_Anastasia_RNASeq_PUBLISHING
-rwxr-xr-x 1 jhuang jhuang 169K Jun 18 2024 Aufnahmeantrag_komplett_10_2022.pdf
-rwxr-xr-x 1 jhuang jhuang 1,2M Jun 18 2024 Astrovirus.pdf
-rwxr-xr-x 1 jhuang jhuang 732 Jun 18 2024 COMMANDS
-rwxr-xr-x 1 jhuang jhuang 690 Jun 18 2024 Bacterial_pipelines.txt
-rwxr-xr-x 1 jhuang jhuang 16M Jun 18 2024 COMPSRA_uke_DEL.jar
-rwxr-xr-x 1 jhuang jhuang 239K Jun 18 2024 ChIPSeq_pipeline_desc.docx
-rwxr-xr-x 1 jhuang jhuang 385K Jun 18 2024 ChIPSeq_pipeline_desc.pdf
-rwxr-xr-x 1 jhuang jhuang 2,1M Jun 18 2024 Comparative_genomic_analysis_of_eight_novel_haloal.pdf
-rwxr-xr-x 1 jhuang jhuang 64K Jun 18 2024 CvO_Klassenliste_7_3.pdf
-rwxr-xr-x 1 jhuang jhuang 649K Jun 18 2024 'Copy of pool_b1_CGATGT_300.xlsx'
-rwxr-xr-x 1 jhuang jhuang 3,9K Jun 18 2024 Fran_16S_Exp8-17-21-27.txt
-rwxr-xr-x 1 jhuang jhuang 463 Jun 18 2024 HPI_DRIVE
-rwxr-xr-x 1 jhuang jhuang 179K Jun 18 2024 HEV_aligned.fasta
-rwxr-xr-x 1 jhuang jhuang 4,1K Jun 18 2024 INTENSO_DIR
-rwxr-xr-x 1 jhuang jhuang 14K Jun 18 2024 HPI_samples_for_NGS_29.09.22.xlsx
-rwxr-xr-x 1 jhuang jhuang 4,3K Jun 18 2024 Hotmail_to_Gmail
-rwxr-xr-x 1 jhuang jhuang 13M Jun 18 2024 Indra_Thesis_161020.pdf
-rwxr-xr-x 1 jhuang jhuang 5,2K Jun 18 2024 'LT K331A.gbk'
-rwxr-xr-x 1 jhuang jhuang 0 Jun 18 2024 LOG_p954_stat
-rwxr-xr-x 1 jhuang jhuang 684K Jun 18 2024 LOG
-rwxr-xr-x 1 jhuang jhuang 197K Jun 18 2024 Manuscript_10_02_2021.docx
-rwxr-xr-x 1 jhuang jhuang 595K Jun 18 2024 Metagenomics_Tools_and_Insights.pdf
-rwxr-xr-x 1 jhuang jhuang 14K Jun 18 2024 'Miseq Amplikon LAuf April.xlsx'
-rwxr-xr-x 1 jhuang jhuang 2,2M Jun 18 2024 NGS.tar.gz
-rwxr-xr-x 1 jhuang jhuang 586K Jun 18 2024 Nachweis_Bakterien_Viren_im_Hochdurchsatz.pdf
-rwxr-xr-x 1 jhuang jhuang 1,2K Jun 18 2024 Nicole8_Lamprecht_logs
-rwxr-xr-x 1 jhuang jhuang 24M Jun 18 2024 Nanopore.handouts.pdf
-rwxr-xr-x 1 jhuang jhuang 113K Jun 18 2024 'Norovirus paper Susanne 191105.docx'
-rwxr-xr-x 1 jhuang jhuang 503K Jun 18 2024 PhyloRNAalifold.pdf
-rwxr-xr-x 1 jhuang jhuang 19K Jun 18 2024 README_R
-rwxr-xr-x 1 jhuang jhuang 137K Jun 18 2024 README_RNAHiSwitch_DEL
-rwxr-xr-x 1 jhuang jhuang 8,3M Jun 18 2024 RNA-NGS_Analysis_modul3_NanoStringNorm.zip
-rwxr-xr-x 1 jhuang jhuang 57K Jun 18 2024 RNAConSLOptV1.2.tar.gz
-rwxr-xr-x 1 jhuang jhuang 17K Jun 18 2024 'RSV GFP5 including 3`UTR.docx'
-rwxr-xr-x 1 jhuang jhuang 238 Jun 18 2024 SNPs_on_pangenome.txt
-rwxr-xr-x 1 jhuang jhuang 55 Jun 18 2024 SERVER
-rwxr-xr-x 1 jhuang jhuang 26M Jun 18 2024 R_tutorials-master.zip
-rwxr-xr-x 1 jhuang jhuang 182K Jun 18 2024 Rawdata_Readme.pdf
-rwxr-xr-x 1 jhuang jhuang 40K Jun 18 2024 SUB10826945_record_preview.txt
-rwxr-xr-x 1 jhuang jhuang 283K Jun 18 2024 S_staphylococcus_annotated_diff_expr.xls
-rwxr-xr-x 1 jhuang jhuang 2,0K Jun 18 2024 Snakefile_list
-rwxr-xr-x 1 jhuang jhuang 160K Jun 18 2024 Source_Classification_Code.rds
-rwxr-xr-x 1 jhuang jhuang 61K Jun 18 2024 Supplementary_Table_S3.xlsx
-rwxr-xr-x 1 jhuang jhuang 617 Jun 18 2024 Untitled.ipynb
-rwxr-xr-x 1 jhuang jhuang 127M Jun 18 2024 UniproUGENE_UserManual.pdf
-rwxr-xr-x 1 jhuang jhuang 14M Jun 18 2024 Untitled1.ipynb
-rwxr-xr-x 1 jhuang jhuang 110K Jun 18 2024 Untitled2.ipynb
-rwxr-xr-x 1 jhuang jhuang 2,9K Jun 18 2024 Untitled3.ipynb
-rwxr-xr-x 1 jhuang jhuang 18K Jun 18 2024 WAC6h_vs_WAP6h_down.txt
-rwxr-xr-x 1 jhuang jhuang 100 Jun 18 2024 damian_nodbs
-rwxr-xr-x 1 jhuang jhuang 45K Jun 18 2024 WAC6h_vs_WAP6h_up.txt
-rwxr-xr-x 1 jhuang jhuang 635K Jun 18 2024 'add. Figures Hamburg_UKE.pptx'
-rwxr-xr-x 1 jhuang jhuang 3,7M Jun 18 2024 all_gene_counts_with_annotation.xlsx
-rwxr-xr-x 1 jhuang jhuang 22K Jun 18 2024 app_flask.py
-rwxr-xr-x 1 jhuang jhuang 1,8K Jun 18 2024 bengal-bay-0.1.json
-rwxr-xr-x 1 jhuang jhuang 16K Jun 18 2024 bengal3_ac3.yml
-rwxr-xr-x 1 jhuang jhuang 246K Jun 18 2024 call_shell_from_Ruby.png
-rwxr-xr-x 1 jhuang jhuang 8,1K Jun 18 2024 bengal3_ac3_.yml
-rwxr-xr-x 1 jhuang jhuang 12 Jun 18 2024 empty.fasta
-rwxr-xr-x 1 jhuang jhuang 32K Jun 18 2024 coefficients_csaw_vs_diffreps.xlsx
-rwxr-xr-x 1 jhuang jhuang 4,3K Jun 18 2024 exchange.txt
-rwxr-xr-x 1 jhuang jhuang 30M Jun 18 2024 exdata-data-NEI_data.zip
-rwxr-xr-x 1 jhuang jhuang 6,6K Jun 18 2024 genes_wac6_wap6.xls
-rwxr-xr-x 1 jhuang jhuang 115M Jun 18 2024 go1.13.linux-amd64.tar.gz.1
-rwxr-xr-x 1 jhuang jhuang 29K Jun 18 2024 hev_p2-p5.fa
-rwxr-xr-x 1 jhuang jhuang 3,8K Jun 18 2024 map_corrected_backup.txt
-rwxr-xr-x 1 jhuang jhuang 325 Jun 18 2024 install_nginx_on_hamm
-rwxr-xr-x 1 jhuang jhuang 20M Jun 18 2024 hg19.rmsk.bed
-rwxr-xr-x 1 jhuang jhuang 107K Jun 18 2024 metadata-9563675-processed-ok.tsv
-rwxr-xr-x 1 jhuang jhuang 288K Jun 18 2024 mkg_sprechstundenflyer_ver1b_dezember_2019.pdf
-rwxr-xr-x 1 jhuang jhuang 588 Jun 18 2024 multiqc_config.yaml
-rwxr-xr-x 1 jhuang jhuang 38K Jun 18 2024 p11326_OMIKRON3398_corsurv.gb
-rwxr-xr-x 1 jhuang jhuang 30K Jun 18 2024 p11326_OMIKRON3398_corsurv.gb_converted.fna
-rwxr-xr-x 1 jhuang jhuang 3,9K Jun 18 2024 parseGenbank_reformat.py
-rwxr-xr-x 1 jhuang jhuang 222K Jun 18 2024 pangenome-snakemake-master.zip
-rwxr-xr-x 1 jhuang jhuang 283K Jun 18 2024 'phylo tree draft.pdf'
-rwxr-xr-x 1 jhuang jhuang 125 Jun 18 2024 qiime_params.txt
-rwxr-xr-x 1 jhuang jhuang 2,3M Jun 18 2024 pool_b1_CGATGT_300.zip
-rwxr-xr-x 1 jhuang jhuang 5,5K Jun 18 2024 qiime_params_backup.txt
-rwxr-xr-x 1 jhuang jhuang 4,5K Jun 18 2024 qiime_params_s16_s18.txt
-rwxr-xr-x 1 jhuang jhuang 68 Jun 18 2024 snakePipes
-rwxr-xr-x 1 jhuang jhuang 25K Jun 18 2024 results_description.html
-rwxr-xr-x 1 jhuang jhuang 139M Jun 18 2024 rnaalihishapes.tar.gz
-rwxr-xr-x 1 jhuang jhuang 3,4M Jun 18 2024 rnaseq_length_bias.pdf
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 3932-Leber
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 BioPython
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Biopython
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 DEEP-DV
drwxr-xr-x 13 jhuang jhuang 4,0K Jun 18 2024 DOKTORARBEIT
drwxr-xr-x 17 jhuang jhuang 4,0K Jun 18 2024 Data_16S_Arck_vaginal_stool
drwxr-xr-x 22 jhuang jhuang 4,0K Jun 18 2024 Data_16S_BS052
drwxr-xr-x 13 jhuang jhuang 4,0K Jun 18 2024 Data_16S_Birgit
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_16S_Christner
drwxr-xr-x 9 jhuang jhuang 4,0K Jun 18 2024 Data_16S_Leonie
drwxr-xr-x 11 jhuang jhuang 4,0K Jun 18 2024 Data_16S_PatientA-G_CSF
drwxr-xr-x 14 jhuang jhuang 4,0K Jun 18 2024 Data_16S_Schaltenberg
drwxr-xr-x 7 jhuang jhuang 4,0K Jun 18 2024 Data_16S_benchmark
drwxr-xr-x 7 jhuang jhuang 4,0K Jun 18 2024 Data_16S_benchmark2
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_16S_gcdh_BKV
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Alex1_Amplicon
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Alex1_SNP
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Data_Analysis_for_Life_Science
drwxr-xr-x 19 jhuang jhuang 4,0K Jun 18 2024 Data_Anna13_vanA-Element
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Anna14_PACBIO_methylation
drwxr-xr-x 8 jhuang jhuang 4,0K Jun 18 2024 Data_Anna_C.acnes2_old_DEL
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Anna_MT880872_update
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Anna_gap_filling_agrC
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 Data_Baechlein_Hepacivirus_2018
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Bornavirus
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_CSF
drwxr-xr-x 9 jhuang jhuang 4,0K Jun 18 2024 Data_Christine_cz19-178-rothirsch-bovines-hepacivirus
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 Data_Daniela_adenovirus_WGS
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_Emilia_MeDIP_DEL
drwxr-xr-x 14 jhuang jhuang 4,0K Jun 18 2024 Data_Francesco2021_16S
drwxr-xr-x 9 jhuang jhuang 4,0K Jun 18 2024 Data_Francesco2021_16S_re
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_Gunnar_MS
drwxr-xr-x 10 jhuang jhuang 4,0K Jun 18 2024 Data_Hannes_RNASeq
drwxr-xr-x 29 jhuang jhuang 4,0K Jun 18 2024 Data_Holger_Efaecium_variants_PUBLISHED
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Data_Holger_VRE_DEL
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_Icebear_Damian
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Indra3_H3K4_p2_DEL
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 Data_Indra6_RNASeq_ChipSeq_Integration_DEL
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Indra_Figures
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_KatjaGiersch_new_HDV
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_MHH_Encephalitits_DAMIAN
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 Data_Manja_RPAChIPSeq_public
drwxr-xr-x 72 jhuang jhuang 12K Jun 18 2024 Data_Manuel_WGS_Yersinia
drwxr-xr-x 32 jhuang jhuang 4,0K Jun 18 2024 Data_Manuel_WGS_Yersinia2_DEL
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 Data_Manuel_WGS_Yersinia_DEL
drwxr-xr-x 13 jhuang jhuang 4,0K Jun 18 2024 Data_Marcus_tracrRNA_structures
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Data_Mausmaki_Damian
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 Data_Nicole1_Tropheryma_whipplei
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Nicole5
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 Data_Nicole5_77-92
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_PaulBecher_Rotavirus
drwxr-xr-x 21 jhuang jhuang 4,0K Jun 18 2024 Data_Pietschmann_HCV_Amplicon_bigFile
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Piscine_Orthoreovirus_3_in_Brown_Trout
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Proteomics
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_RNABioinformatics
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_RNAKinetics
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_R_courses
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Data_SARS-CoV-2
drwxr-xr-x 9 jhuang jhuang 4,0K Jun 18 2024 Data_SARS-CoV-2_Genome_Announcement_PUBLISHED
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Seite
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Song_aggregate_sum
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_Susanne_Amplicon_RdRp_orf1_2_re
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Tabea_RNASeq
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Data_Thaiss1_Microarray_new
drwxr-xr-x 10 jhuang jhuang 4,0K Jun 18 2024 Data_Tintelnot_16S
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Wuenee_Plots
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Yang_Poster
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 Data_jupnote
drwxr-xr-x 21 jhuang jhuang 4,0K Jun 18 2024 Data_parainfluenza
drwxr-xr-x 9 jhuang jhuang 4,0K Jun 18 2024 Data_snakemake_recipe
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_temp
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 Data_viGEN
drwxr-xr-x 19 jhuang jhuang 4,0K Jun 18 2024 Genomic_Data_Science
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Learn_UGENE
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 MMcPaper
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Manuscript_Epigenetics_Macrophage_Yersinia
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 Manuscript_RNAHiSwitch
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 MeDIP_Emilia_copy_DEL
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Method_biopython
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 NGS
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 Okazaki-Seq_Processing
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 RNA-NGS_Analysis_modul3_NanoStringNorm
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 RNAConSLOptV1.2
drwxr-xr-x 9 jhuang jhuang 4,0K Jun 18 2024 RNAHeliCes
drwxr-xr-x 11 jhuang jhuang 4,0K Jun 18 2024 RNA_li_HeliCes
drwxr-xr-x 10 jhuang jhuang 4,0K Jun 18 2024 RNAliHeliCes
drwxr-xr-x 10 jhuang jhuang 4,0K Jun 18 2024 RNAliHeliCes_Relatedshapes_modified
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 R_refcard
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 R_DataCamp
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 18 2024 R_cats_package
drwxr-xr-x 9 jhuang jhuang 4,0K Jun 18 2024 R_tutorials-master
drwxr-xr-x 7 jhuang jhuang 4,0K Jun 18 2024 SnakeChunks
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 align_4l_on_FJ705359
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 align_4p_on_FJ705359
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 assembly
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 bacto
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 bam2fastq_mapping_again
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 chipster
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 damian_GUI
drwxr-xr-x 4 jhuang jhuang 4,0K Jun 18 2024 enhancer-snakemake-demo
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 hg19_gene_annotations
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 interlab_comparison_DEL
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 18 2024 my_flask
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 papers
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 pangenome-snakemake_zhaoc1
drwxr-xr-x 14 jhuang jhuang 4,0K Jun 18 2024 pyflow-epilogos
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 raw_data_rnaseq_Indra
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 test_raw_data_dnaseq
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 test_raw_data_rnaseq
drwxr-xr-x 6 jhuang jhuang 4,0K Jun 18 2024 to_Francesco
drwxr-xr-x 36 jhuang jhuang 4,0K Jun 18 2024 ukepipe
drwxr-xr-x 15 jhuang jhuang 4,0K Jun 18 2024 ukepipe_nf
drwxr-xr-x 17 jhuang jhuang 4,0K Jun 18 2024 var_www_DjangoApp_mysite2_2023-05
-rwxr-xr-x 1 jhuang jhuang 59K Jun 18 2024 roentgenpass.pdf
-rwxr-xr-x 1 jhuang jhuang 9,1M Jun 18 2024 salmon_tx2gene_GRCh38.tsv
-rwxr-xr-x 1 jhuang jhuang 4,1K Jun 18 2024 salmon_tx2gene_chrHsv1.tsv
-rwxr-xr-x 1 jhuang jhuang 8,9K Jun 18 2024 'sample IDs_Lamprecht.xlsx'
-rwxr-xr-x 1 jhuang jhuang 30M Jun 18 2024 summarySCC_PM25.rds
-rwxr-xr-x 1 jhuang jhuang 0 Jun 18 2024 untitled.py
-rwxr-xr-x 1 jhuang jhuang 11M Jun 18 2024 tutorial-rnaseq.pdf
-rwxr-xr-x 1 jhuang jhuang 1,3K Jun 18 2024 x.log
-rwxr-xr-x 1 jhuang jhuang 381M Jun 18 2024 webapp.tar.gz
-rw-rw-r-- 1 jhuang jhuang 8,4K Okt 9 2024 temp
-rw-rw-r-- 1 jhuang jhuang 2,7K Okt 9 2024 temp2
drwxr-xr-x 51 jhuang jhuang 12K Feb 17 12:23 Data_Susanne_Amplicon_haplotype_analyses_RdRp_orf1_2_re
drwxr-xr-x 6 jhuang jhuang 4,0K Feb 17 12:42 Data_Susanne_WGS_unbiased
jhuang@WS-2290C:~/DATA_D$ ls -tlrh
total 56K
lrwxrwxrwx 1 jhuang jhuang 59 Apr 11 2024 Data_Soeren_RNA-seq_2023_PUBLISHING -> /media/jhuang/Elements/Data_Soeren_RNA-seq_2023_PUBLISHING/
lrwxrwxrwx 1 jhuang jhuang 32 Apr 11 2024 Data_Ute -> /media/jhuang/Elements/Data_Ute/
lrwxrwxrwx 1 jhuang jhuang 52 Apr 23 2024 Data_Marc_RNA-seq_Sepidermidis -> /media/jhuang/Titisee/Data_Marc_RNA-seq_Sepidermidis
drwxrwxr-x 2 jhuang jhuang 4,0K Mai 2 2024 Data_Patricia_Transposon
drwxrwxr-x 2 jhuang jhuang 4,0K Mai 29 2024 Books_DA_for_Life
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 18 2024 Data_Sven
-rw-rw-r-- 1 jhuang jhuang 2,9K Jul 16 2024 Datasize_calculation_based_on_coverage.txt
drwxr-xr-x 6 jhuang jhuang 4,0K Jul 23 2024 Data_Paul_HD46_1-wt_resequencing
drwxrwxr-x 2 jhuang jhuang 4,0K Jul 26 2024 Data_Sanam_DAMIAN
drwxrwxr-x 26 jhuang jhuang 12K Jul 30 2024 Data_Tam_variant_calling
drwxrwxr-x 2 jhuang jhuang 4,0K Aug 26 2024 Data_Samira_Manuscripts
drwxrwxr-x 2 jhuang jhuang 4,0K Aug 27 2024 Data_Silvia_VoltRon_Debug
drwxrwxr-x 38 jhuang jhuang 4,0K Jun 10 2025 Data_Pietschmann_229ECoronavirus_Mutations_2024
drwxrwxr-x 23 jhuang jhuang 4,0K Jun 25 2025 Data_Pietschmann_229ECoronavirus_Mutations_2025
lrwxrwxrwx 1 jhuang jhuang 63 Nov 24 16:30 Data_Birthe_Svenja_RSV_Probe3_PUBLISHING -> /media/jhuang/Elements/Data_Birthe_Svenja_RSV_Probe3_PUBLISHING
jhuang@WS-2290C:~/DATA_E$ ls -tlrh
total 119M
drwxr-xr-x 10 jhuang jhuang 4,0K Apr 18 2019 j_huang_until_201904
drwxr-xr-x 2 jhuang jhuang 4,0K Apr 29 2019 Data_2019_April
drwxr-xr-x 2 jhuang jhuang 4,0K Mai 10 2019 Data_2019_May
drwxr-xr-x 2 jhuang jhuang 4,0K Jun 17 2019 Data_2019_June
drwxr-xr-x 2 jhuang jhuang 4,0K Jul 12 2019 Data_2019_July
drwxr-xr-x 3 jhuang jhuang 4,0K Aug 29 2019 Data_2019_August
drwxr-xr-x 3 jhuang jhuang 4,0K Sep 5 2019 Data_2019_September
drwxr-xr-x 11 jhuang jhuang 4,0K Apr 18 2023 Data_Song_RNASeq_PUBLISHED
drwxr-xr-x 7 jhuang jhuang 4,0K Okt 10 2023 Data_Laura_MP_RNASeq
drwxr-xr-x 22 jhuang jhuang 12K Nov 3 2023 Data_Nicole6_HEV_Swantje2
drwxr-xr-x 17 jhuang jhuang 4,0K Nov 13 2023 Data_Becher_Damian_Picornavirus_BovHepV
-rwxr-xr-x 1 jhuang jhuang 118M Nov 28 2023 bacteria_refseq.zip
drwxr-xr-x 3 jhuang jhuang 4,0K Nov 30 2023 bacteria_refseq
drwxr-xr-x 8 jhuang jhuang 4,0K Nov 30 2023 Data_Rotavirus
drwxr-xr-x 6 jhuang jhuang 4,0K Dez 6 2023 Data_Xiaobo_10x
drwx------ 17 jhuang jhuang 4,0K Feb 7 2025 Data_Becher_Damian_Picornavirus_BovHepV_INCOMPLETE_DEL
jhuang@WS-2290C:~/DATA_Intenso$ ls -ltrh
total 4,1G
drwxr-xr-x 15 jhuang jhuang 4,0K Mär 30 2015 HOME_FREIBURG_DEL
drwxr-xr-x 2 jhuang jhuang 4,0K Aug 12 2015 150810_M03701_0019_000000000-AFJFK
drwxr-xr-x 5 jhuang jhuang 4,0K Jan 31 2017 Data_Thaiss2_Microarray
drwxr-xr-x 9 jhuang jhuang 4,0K Apr 27 2017 VirtualBox_VMs_DEL
drwxr-xr-x 7 jhuang jhuang 4,0K Apr 27 2017 'VirtualBox VMs_DEL'
drwxr-xr-x 7 jhuang jhuang 4,0K Apr 27 2017 'VirtualBox VMs2_DEL'
drwxr-xr-x 16 jhuang jhuang 4,0K Mai 12 2017 websites
drwxr-xr-x 5 jhuang jhuang 4,0K Jun 29 2017 DATA
drwxr-xr-x 149 jhuang jhuang 36K Jun 30 2017 Data_Laura
drwxr-xr-x 149 jhuang jhuang 36K Jun 30 2017 Data_Laura_2
drwxr-xr-x 3 jhuang jhuang 4,0K Jun 30 2017 Data_Laura_3
drwxr-xr-x 7 jhuang jhuang 4,0K Jul 10 2017 galaxy_tools
drwxr-xr-x 45 jhuang jhuang 32K Jul 17 2017 Downloads2
drwxr-xr-x 3 jhuang jhuang 4,0K Jul 27 2017 Downloads
drwxr-xr-x 3 jhuang jhuang 4,0K Jul 28 2017 mom-baby_com_cn
drwxr-xr-x 3 jhuang jhuang 4,0K Aug 8 2017 'VirtualBox VMs2'
drwxr-xr-x 3 jhuang jhuang 4,0K Aug 9 2017 VirtualBox_VMs
drwxr-xr-x 3 jhuang jhuang 4,0K Aug 11 2017 CLC_Data
drwxr-xr-x 6 jhuang jhuang 12K Aug 14 2017 Work_Dir2
drwxr-xr-x 7 jhuang jhuang 4,0K Aug 15 2017 Work_Dir2_SGE
drwxr-xr-x 19 jhuang jhuang 4,0K Aug 24 2017 Data_SPANDx1_Kpneumoniae_vs_Assembly1
drwxr-xr-x 3 jhuang jhuang 4,0K Aug 24 2017 MauveOutput
drwxr-xr-x 3 jhuang jhuang 4,0K Aug 31 2017 Fastqs
drwxr-xr-x 20 jhuang jhuang 4,0K Sep 7 2017 Data_Anna3_VRE_Ausbruch
drwxr-xr-x 8 jhuang jhuang 4,0K Sep 19 2017 Work_Dir_mock_broad_mockinput
drwxr-xr-x 8 jhuang jhuang 4,0K Sep 19 2017 Work_Dir_dM_broad_mockinput
drwxr-xr-x 4 jhuang jhuang 4,0K Okt 6 2017 Data_Anna8_RNASeq_static_shake_deprecated
drwxr-xr-x 24 jhuang jhuang 4,0K Okt 9 2017 PENDRIVE_cont
drwxr-xr-x 8 jhuang jhuang 4,0K Okt 23 2017 Work_Dir_WAP_broad_mockinput
drwxr-xr-x 10 jhuang jhuang 4,0K Okt 23 2017 Work_Dir_WAC_broad_mockinput
drwxr-xr-x 11 jhuang jhuang 4,0K Okt 23 2017 Work_Dir_dP_broad_mockinput
drwxr-xr-x 52 jhuang jhuang 4,0K Nov 8 2017 Data_Nicole10_16S_interlab
drwxr-xr-x 6 jhuang jhuang 4,0K Dez 6 2017 PAPERS
drwxr-xr-x 14 jhuang jhuang 16K Dez 15 2017 TB
drwxr-xr-x 5 jhuang jhuang 4,0K Dez 19 2017 Data_Anna4_SNP
drwxr-xr-x 11 jhuang jhuang 4,0K Jan 16 2018 Data_Carolin1_16S
drwxr-xr-x 2 jhuang jhuang 4,0K Jan 22 2018 ChipSeq_Raw_Data3_171009_NB501882_0024_AHNGTYBGX3
-rw-r--r-- 1 jhuang jhuang 4,0G Jan 23 2018 m_aepfelbacher_DEL.zip
drwxr-xr-x 7 jhuang jhuang 4,0K Jan 24 2018 Data_Anna7_RNASeq_Cytoscape
drwxr-xr-x 3 jhuang jhuang 4,0K Jan 24 2018 Data_Nicole9_Hund_Katze_Mega
drwxr-xr-x 39 jhuang jhuang 20K Jan 28 2018 Data_Anna2_CO6114
drwxr-xr-x 3 jhuang jhuang 4,0K Jan 28 2018 Data_Nicole3_TH17_orig
drwxr-xr-x 27 jhuang jhuang 28K Jan 28 2018 Data_Nicole1_Tropheryma_whipplei
drwxr-xr-x 16 jhuang jhuang 4,0K Jan 30 2018 results_K27
drwxr-xr-x 2 jhuang jhuang 4,0K Feb 19 2018 'VirtualBox VMs'
drwxr-xr-x 28 jhuang jhuang 12K Feb 27 2018 Data_Anna6_RNASeq
drwxr-xr-x 17 jhuang jhuang 12K Mär 1 2018 Data_Anna1_1585_RNAseq
drwxr-xr-x 21 jhuang jhuang 4,0K Mär 7 2018 Data_Thaiss1_Microarray
drwxr-xr-x 25 jhuang jhuang 12K Mär 27 2018 Data_Nicole7_Anelloviruses_Polyomavirus
drwxr-xr-x 13 jhuang jhuang 4,0K Mai 22 2018 Data_Nina1_Nicole5_1-76
drwxr-xr-x 11 jhuang jhuang 4,0K Mai 22 2018 Data_Nina1_merged
drwxr-xr-x 32 jhuang jhuang 4,0K Jun 14 2018 Data_Nicole8_Lamprecht
drwxr-xr-x 40 jhuang jhuang 16K Jul 5 2018 Data_Anna5_SNP
drwxr-xr-x 35 jhuang jhuang 4,0K Okt 12 2018 chipseq
drwxr-xr-x 107 jhuang jhuang 76K Mai 18 2019 Downloads_DEL
drwxr-xr-x 7 jhuang jhuang 4,0K Mär 17 2020 Data_Gagliani2_enriched_16S
drwxr-xr-x 17 jhuang jhuang 4,0K Mär 17 2020 Data_Gagliani1_18S_16S
drwxr-xr-x 2 jhuang jhuang 4,0K Apr 2 2020 m_aepfelbacher
drwxr-xr-x 4 jhuang jhuang 4,0K Feb 17 12:38 Data_Susanne_WGS_3amplicons
jhuang@WS-2290C:/media/jhuang/Titisee$ ls -tlrh
total 3,5G
drwxrwxrwx 1 jhuang jhuang 0 Dez 19 2017 Data_Anna4_SNP
drwxrwxrwx 1 jhuang jhuang 4,0K Jan 24 2018 Data_Anna5_SNP_rsync_error
-rwxrwxrwx 1 jhuang jhuang 9,9K Mär 21 2018 TRASH
drwxrwxrwx 1 jhuang jhuang 20K Mär 28 2018 Data_Nicole6_HEV_4_SNP_calling_PE_DEL
drwxrwxrwx 1 jhuang jhuang 4,0K Mai 22 2018 Data_Nina1_Nicole7
drwxrwxrwx 1 jhuang jhuang 8,0K Mai 24 2018 Data_Nicole6_HEV_4_SNP_calling_SE_DEL
-rwxrwxrwx 1 jhuang jhuang 3,5G Jun 14 2018 180119_M03701_0115_000000000-BFG46.zip
drwxrwxrwx 1 jhuang jhuang 4,0K Jul 10 2018 Data_Nicole10_16S_interlab_PUBLISHED
drwxrwxrwx 1 jhuang jhuang 4,0K Jul 10 2018 Anna11_assemblies
drwxrwxrwx 1 jhuang jhuang 4,0K Jul 11 2018 Anna11_trees
drwxrwxrwx 1 jhuang jhuang 4,0K Jul 24 2018 Data_Nicole6_HEV_new_orig_fastqs
drwxrwxrwx 1 jhuang jhuang 4,0K Nov 23 2018 Data_Anna9_OXA-48_or_OXA-181
drwxrwxrwx 1 jhuang jhuang 4,0K Feb 15 2019 bengal_results_v1_2018
-rwxrwxrwx 1 jhuang jhuang 9,8M Mär 22 2019 DO.pdf
drwxrwxrwx 1 jhuang jhuang 4,0K Mai 6 2019 damian_DEL
drwxrwxrwx 1 jhuang jhuang 0 Mai 20 2019 MAGpy_db
drwxrwxrwx 1 jhuang jhuang 0 Aug 3 2019 UGENE_v1_32_data_cistrome
drwxrwxrwx 1 jhuang jhuang 4,0K Aug 3 2019 UGENE_v1_32_data_ngs_classification
drwxrwxrwx 1 jhuang jhuang 52K Okt 25 2019 Data_Nicole6_HEV_Swantje
drwxrwxrwx 1 jhuang jhuang 8,0K Okt 25 2019 Data_Nico_Gagliani
drwxrwxrwx 1 jhuang jhuang 4,0K Mär 30 2020 GAMOLA2_prototyp
drwxrwxrwx 1 jhuang jhuang 8,0K Mär 31 2020 Thomas_methylation_EPIC_DO
drwxrwxrwx 1 jhuang jhuang 8,0K Jun 15 2020 Data_Nicola_Schaltenberg
drwxrwxrwx 1 jhuang jhuang 36K Jun 25 2020 Data_Nicola_Schaltenberg_PICRUSt
drwxrwxrwx 1 jhuang jhuang 12K Jan 25 2021 HOME_FREIBURG
drwxrwxrwx 1 jhuang jhuang 4,0K Okt 13 2021 Data_Francesco_16S
drwxrwxrwx 1 jhuang jhuang 4,0K Jun 14 2022 3rd_party
drwxrwxrwx 1 jhuang jhuang 4,0K Jul 29 2022 ConsPred_prokaryotic_genome_annotation
drwxrwxrwx 1 jhuang jhuang 4,0K Aug 2 2022 'System Volume Information'
drwxrwxrwx 1 jhuang jhuang 0 Sep 16 2022 damian_v201016
drwxrwxrwx 1 jhuang jhuang 36K Jan 12 2023 Data_Holger_VRE
drwxrwxrwx 1 jhuang jhuang 32K Feb 1 2023 Data_Holger_Pseudomonas_aeruginosa_SNP
drwxrwxrwx 1 jhuang jhuang 4,0K Sep 5 2023 Eigene_Ordner_HR
drwxrwxrwx 1 jhuang jhuang 24K Sep 6 2023 GAMOLA2
drwxrwxrwx 1 jhuang jhuang 24K Sep 27 2023 Data_Anastasia_RNASeq
drwxrwxrwx 1 jhuang jhuang 24K Okt 20 2023 Data_Amir_PUBLISHED
drwxrwxrwx 1 jhuang jhuang 44K Apr 25 2024 Data_Marc_RNA-seq_Sepidermidis
drwxrwxrwx 1 jhuang jhuang 4,0K Sep 23 2024 '$RECYCLE.BIN'
drwxrwxrwx 1 jhuang jhuang 4,0K Sep 23 2024 Data_Xiaobo_10x_3
drwxrwxrwx 1 jhuang jhuang 24K Nov 28 2024 Data_Tam_DNAseq_2023_Comparative_ATCC19606_AYE_ATCC17978
drwxrwxrwx 1 jhuang jhuang 48K Dez 19 2024 Data_Holger_S.epidermidis_short
-rwxrwxrwx 1 jhuang jhuang 31 Feb 4 2025 TEMP
drwxrwxrwx 1 jhuang jhuang 12K Aug 22 11:44 Data_Holger_S.epidermidis_long
jhuang@WS-2290C:/media/jhuang/Elements(Denise_ChIPseq)$ ls -tlrh
total 11M
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 7 2019 Data_Denise_LTtrunc_H3K27me3_2_results_DEL
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 7 2019 Data_Denise_LTtrunc_H3K4me3_2_results_DEL
drwxr-xr-x 1 jhuang jhuang 28K Aug 26 2019 Data_Anna12_HAPDICS_final_not_finished_DEL
drwxr-xr-x 1 jhuang jhuang 4,0K Okt 24 2019 m_aepfelbacher_DEL
drwxr-xr-x 1 jhuang jhuang 20K Jan 14 2020 Data_Damian
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 25 2020 ST772_DEL
drwxr-xr-x 1 jhuang jhuang 160K Jan 25 2020 ALL_trimmed_part_DEL
drwxr-xr-x 1 jhuang jhuang 0 Mär 30 2020 Data_Denise_ChIPSeq_Protocol1
drwxr-xr-x 1 jhuang jhuang 44K Mai 19 2020 Data_Pietschmann_HCV_Amplicon
drwxr-xr-x 1 jhuang jhuang 60K Jun 26 2020 Data_Nicole6_HEV_ownMethod_new
-rwxr-xr-x 1 jhuang jhuang 2,5M Aug 5 2020 HD04-1.fasta
drwxr-xr-x 1 jhuang jhuang 4,0K Mai 31 2021 RNAHiSwitch_
drwxr-xr-x 1 jhuang jhuang 4,0K Mai 31 2021 RNAHiSwitch__
drwxr-xr-x 1 jhuang jhuang 8,0K Jun 17 2021 RNAHiSwitch___
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 25 2021 RNAHiSwitch_paper_
drwxr-xr-x 1 jhuang jhuang 0 Jul 7 2021 RNAHiSwitch_milestone1_DELETED
-rwxr-xr-x 1 jhuang jhuang 7,2M Jul 7 2021 RNAHiSwitch_paper.tar.gz
drwxr-xr-x 1 jhuang jhuang 4,0K Jul 12 2021 RNAHiSwitch_paper_DELETED
drwxr-xr-x 1 jhuang jhuang 12K Jul 12 2021 RNAHiSwitch_milestone1
drwxr-xr-x 1 jhuang jhuang 4,0K Aug 23 2021 RNAHiSwitch_paper
drwxr-xr-x 1 jhuang jhuang 4,0K Sep 24 2021 Ute_RNASeq_results
drwxr-xr-x 1 jhuang jhuang 4,0K Sep 24 2021 Ute_miRNA_results_38
drwxr-xr-x 1 jhuang jhuang 88K Okt 27 2021 RNAHiSwitch
drwxr-xr-x 1 jhuang jhuang 48K Mär 31 2022 Data_HepE_Freiburg_PUBLISHED
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 1 2022 Data_INTENSO_2022-06
drwxr-xr-x 1 jhuang jhuang 0 Sep 14 2022 '$RECYCLE.BIN'
drwxr-xr-x 1 jhuang jhuang 4,0K Sep 14 2022 'System Volume Information'
drwxr-xr-x 1 jhuang jhuang 4,0K Dez 7 2022 Data_Anna_Mixta_hanseatica_PUBLISHED
-rwxr-xr-x 1 jhuang jhuang 33K Dez 9 2022 coi_disclosure.docx
drwxr-xr-x 1 jhuang jhuang 20K Feb 8 2023 Data_Jingang
drwxr-xr-x 1 jhuang jhuang 4,0K Mai 30 2023 Data_Arck_16S_MMc_PUBLISHED
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 5 2023 Data_Laura_ChIPseq_GSE120945
drwxr-xr-x 1 jhuang jhuang 80K Jun 5 2023 Data_Nicole6_HEV_ownMethod
drwxr-xr-x 1 jhuang jhuang 8,0K Jul 5 2023 Data_Susanne_16S_re_UNPUBLISHED *
drwxr-xr-x 1 jhuang jhuang 4,0K Okt 12 2023 Data_Denise_ChIPSeq_Protocol2
drwxr-xr-x 1 jhuang jhuang 4,0K Okt 20 2023 Data_Caroline_RNAseq_wt_timecourse
drwxr-xr-x 1 jhuang jhuang 4,0K Okt 20 2023 Data_Caroline_RNAseq_brain_organoids
drwxr-xr-x 1 jhuang jhuang 20K Okt 20 2023 Data_Amir_PUBLISHED_DEL
drwxr-xr-x 1 jhuang jhuang 4,0K Nov 24 2023 Data_download_virus_fam
drwxr-xr-x 1 jhuang jhuang 12K Feb 22 2024 Data_Gunnar_Yersiniomics_COPYFAILED_DEL
drwxr-xr-x 1 jhuang jhuang 20K Feb 27 2024 Data_Paul_and_Marc_Epidome_batch3
-rwxr-xr-x 1 jhuang jhuang 3,0K Okt 30 2024 ifconfig_hamm.txt
drwxr-xr-x 1 jhuang jhuang 8,0K Apr 8 2025 Data_Soeren_2023_PUBLISHING
drwxr-xr-x 1 jhuang jhuang 28K Nov 24 13:34 Data_Birthe_Svenja_RSV_Probe3_PUBLISHING
drwxr-xr-x 1 jhuang jhuang 20K Jan 13 17:46 Data_Ute
drwxr-xr-x 1 jhuang jhuang 12K Feb 17 12:48 Data_Susanne_16S_UNPUBLISHED *
jhuang@WS-2290C:/media/jhuang/Seagate Expansion Drive(HOffice)$ ls -tlrh
total 19M
-rwxrwxrwx 1 jhuang jhuang 550K Jan 8 2015 SeagateExpansion.ico
-rwxrwxrwx 1 jhuang jhuang 38 Mär 27 2015 Autorun.inf
-rwxrwxrwx 2 jhuang jhuang 18M Mai 4 2017 Start_Here_Win.exe
-rwxrwxrwx 1 jhuang jhuang 1,1M Jul 7 2017 Warranty.pdf
drwxrwxrwx 1 jhuang jhuang 0 Jan 9 2018 Start_Here_Mac.app
drwxrwxrwx 1 jhuang jhuang 0 Jan 9 2018 Seagate
drwxrwxrwx 1 jhuang jhuang 0 Jun 5 2024 HomeOffice_DIR (Data_Anna_HAPDICS_RNASeq, From_Samsung_T5)
drwxrwxrwx 1 jhuang jhuang 4,0K Jun 17 2024 DATA_COPY_FROM_178528 (copy_and_clean.sh, logfile_jhuang.log, jhuang)
drwxrwxrwx 1 jhuang jhuang 0 Sep 9 10:41 'System Volume Information'
drwxrwxrwx 1 jhuang jhuang 0 Sep 9 10:41 '$RECYCLE.BIN'
jhuang@WS-2290C:/media/jhuang/Elements(Anna_C.arnes)$ ls -trlh
total 236K
drwxrwxrwx 1 jhuang jhuang 8,0K Nov 14 2018 Data_Swantje_HEV_using_viral-ngs
drwxrwxrwx 1 jhuang jhuang 0 Dez 4 2018 VIPER_static_DEL
drwxrwxrwx 1 jhuang jhuang 4,0K Apr 4 2019 Data_Nicole6_HEV_Swantje1_blood
drwxrwxrwx 1 jhuang jhuang 24K Apr 5 2019 Data_Nicole6_HEV_benchmark
drwxrwxrwx 1 jhuang jhuang 20K Mär 12 2020 Data_Denise_RNASeq_GSE79958
drwxrwxrwx 1 jhuang jhuang 8,0K Jan 11 2022 Data_16S_Leonie_from_Nico_Gaglianis
drwxrwxrwx 1 jhuang jhuang 8,0K Jul 29 2022 Fastqs_19-21
drwxrwxrwx 1 jhuang jhuang 4,0K Aug 2 2022 'System Volume Information'
drwxrwxrwx 1 jhuang jhuang 8,0K Sep 23 2022 Data_Luise_Epidome_test
drwxrwxrwx 1 jhuang jhuang 48K Sep 27 2023 Data_Anna_C.acnes_PUBLISHED
drwxrwxrwx 1 jhuang jhuang 24K Dez 6 2023 Data_Denise_LT_DNA_Bindung
drwxrwxrwx 1 jhuang jhuang 4,0K Jan 9 2024 Data_Denise_LT_K331A_RNASeq
drwxrwxrwx 1 jhuang jhuang 12K Jan 10 2024 Data_Luise_Epidome_batch1
drwxrwxrwx 1 jhuang jhuang 28K Feb 26 2024 Data_Luise_Pseudomonas_aeruginosa_PUBLISHED
drwxrwxrwx 1 jhuang jhuang 28K Feb 27 2024 Data_Luise_Epidome_batch2
drwxrwxrwx 1 jhuang jhuang 4,0K Sep 5 2024 picrust2_out_2024_2
drwxrwxrwx 1 jhuang jhuang 4,0K Mär 11 2025 '$RECYCLE.BIN'
jhuang@WS-2290C:/media/jhuang/Seagate Expansion Drive(DATA_COPY_FROM_hamburg)$ ls -tlrh
total 19M
-rwxrwxrwx 1 jhuang jhuang 33 Feb 21 2018 Autorun.inf
-rwxrwxrwx 2 jhuang jhuang 18M Jun 21 2019 Start_Here_Win.exe
-rwxrwxrwx 1 jhuang jhuang 1,6M Jul 6 2020 Warranty.pdf
drwxrwxrwx 1 jhuang jhuang 0 Mär 10 2021 Start_Here_Mac.app
drwxrwxrwx 1 jhuang jhuang 0 Mär 10 2021 Seagate
drwxrwxrwx 1 jhuang jhuang 12K Jun 29 2022 DATA_COPY_TRANSFER_INCOMPLETE_DEL
drwxrwxrwx 1 jhuang jhuang 4,0K Dez 16 2024 DATA_COPY_FROM_hamburg
jhuang@WS-2290C:/media/jhuang/Seagate Expansion Drive(Seagate_1)$ ls -trlh
total 104G
drwxr-xr-x 1 jhuang jhuang 4,0K Okt 3 2013 RNA_seq_analysis_tools_2013
drwxr-xr-x 1 jhuang jhuang 0 Feb 28 2018 Data_Laura0
drwxr-xr-x 1 jhuang jhuang 8,0K Sep 6 2018 Data_Petra_Arck
drwxr-xr-x 1 jhuang jhuang 4,0K Sep 14 2018 Data_Martin_mycoplasma
drwxr-xr-x 1 jhuang jhuang 8,0K Dez 5 2018 chromhmm-enhancers
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 15 2019 ChromHMM_Dir
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 18 2019 Data_Denise_sT_H3K4me3
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 18 2019 Data_Denise_sT_H3K27me3
drwxr-xr-x 1 jhuang jhuang 0 Feb 13 2019 Start_Here_Mac.app
drwxr-xr-x 1 jhuang jhuang 0 Feb 13 2019 Seagate
drwxr-xr-x 1 jhuang jhuang 4,0K Feb 19 2019 Data_Nicole16_parapoxvirus
-rwxr-xr-x 1 jhuang jhuang 39G Aug 20 2019 Project_h_rohde_Susanne_WGS_unbiased_DEL.zip
drwxr-xr-x 1 jhuang jhuang 4,0K Nov 11 2019 Data_Denise_ChIPSeq_Protocol1
drwxr-xr-x 1 jhuang jhuang 8,0K Nov 13 2019 Data_ENNGS_pathogen_detection_pipeline_comparison
drwxr-xr-x 1 jhuang jhuang 4,0K Feb 18 2020 j_huang_201904_202002
-rwxr-xr-x 1 jhuang jhuang 112 Mär 2 2020 Data_Laura_ChIPseq_GSE120945
drwxr-xr-x 1 jhuang jhuang 8,0K Mär 26 2020 batch_200314_incomplete
-rwxr-xr-x 1 jhuang jhuang 65G Mär 26 2020 m_aepfelbacher.zip
drwxr-xr-x 1 jhuang jhuang 0 Mär 26 2020 m_error_DEL
drwxr-xr-x 1 jhuang jhuang 4,0K Mär 28 2020 batch_200325
drwxr-xr-x 1 jhuang jhuang 4,0K Mär 28 2020 batch_200319
drwxr-xr-x 1 jhuang jhuang 4,0K Mär 30 2020 GAMOLA2_prototyp
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 22 2020 Data_Nicola_Gagliani
drwxr-xr-x 1 jhuang jhuang 4,0K Sep 3 2020 2017-18_raw_data
drwxr-xr-x 1 jhuang jhuang 1,2M Sep 11 2020 Data_Arck_MeDIP
drwxr-xr-x 1 jhuang jhuang 4,0K Okt 16 2020 trimmed
drwxr-xr-x 1 jhuang jhuang 4,0K Dez 23 2020 Data_Nicole_16S_Christmas_2020_2
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 14 2021 j_huang_202007_202012
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 15 2021 Data_Nicole_16S_Christmas_2020
drwxr-xr-x 1 jhuang jhuang 184K Jan 18 2021 Downloads_2021-01-18_DEL
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 28 2021 Data_Laura_plasmid
drwxr-xr-x 1 jhuang jhuang 4,0K Mär 18 2021 Data_Laura_16S_2_re
drwxr-xr-x 1 jhuang jhuang 8,0K Mär 22 2021 Data_Laura_16S_2
drwxr-xr-x 1 jhuang jhuang 4,0K Mär 22 2021 Data_Laura_16S_2_re_
drwxr-xr-x 1 jhuang jhuang 8,0K Mär 23 2021 Data_Laura_16S_merged
drwxr-xr-x 1 jhuang jhuang 32K Nov 7 2022 Downloads_DEL
drwxr-xr-x 1 jhuang jhuang 12K Nov 7 2022 Data_Laura_16S
drwxr-xr-x 1 jhuang jhuang 76K Nov 9 2023 Data_Anna12_HAPDICS_final
drwxr-xr-x 1 jhuang jhuang 0 Dez 4 2023 '$RECYCLE.BIN'
drwxr-xr-x 1 jhuang jhuang 4,0K Dez 4 2023 'System Volume Information'
jhuang@WS-2290C:/media/jhuang/Seagate Expansion Drive(Seagate_2)$ ls -trlh
total 70G
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 5 2017 Data_Nicole4_TH17
-rwxr-xr-x 1 jhuang jhuang 18M Feb 9 2018 Start_Here_Win.exe
-rwxr-xr-x 1 jhuang jhuang 33 Feb 21 2018 Autorun.inf
-rwxr-xr-x 1 jhuang jhuang 1,2M Jul 26 2018 Warranty.pdf
drwxr-xr-x 1 jhuang jhuang 0 Feb 13 2019 Start_Here_Mac.app
drwxr-xr-x 1 jhuang jhuang 0 Feb 13 2019 Seagate
drwxr-xr-x 1 jhuang jhuang 4,0K Dez 20 2019 Data_Denise_RNASeq_trimmed_DEL
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 25 2020 HD12
drwxr-xr-x 1 jhuang jhuang 4,0K Jan 25 2020 Qi_panGenome
drwxr-xr-x 1 jhuang jhuang 44K Jan 25 2020 ALL
drwxr-xr-x 1 jhuang jhuang 0 Feb 14 2020 fastq_HPI_bw_2019_08_and_2020_02
-rwxr-xr-x 1 jhuang jhuang 19K Mär 12 2020 f1_R1_link.sh
-rwxr-xr-x 1 jhuang jhuang 19K Mär 12 2020 f1_R2_link.sh
drwxr-xr-x 1 jhuang jhuang 28K Mär 19 2020 rtpd_files
-rwxr-xr-x 1 jhuang jhuang 65G Apr 2 2020 m_aepfelbacher.zip
drwxr-xr-x 1 jhuang jhuang 4,0K Apr 20 2020 Data_Nicole_16S_Hamburg_Odense_Cornell_Muenster
drwxr-xr-x 1 jhuang jhuang 8,0K Apr 21 2020 HyAsP_incomplete_genomes
drwxr-xr-x 1 jhuang jhuang 4,0K Apr 25 2020 HyAsP_normal_sampled_input
drwxr-xr-x 1 jhuang jhuang 8,0K Apr 28 2020 HyAsP_complete_genomes
-rwxr-xr-x 1 jhuang jhuang 176M Mai 8 2020 video.zip
-rwxr-xr-x 1 jhuang jhuang 6,9K Jun 2 2020 sam2bedgff.pl
-rwxr-xr-x 1 jhuang jhuang 5,5K Jul 7 2020 HD04.infection.hS_vs_HD04.nose.hS_annotated_degenes.xls
drwxr-xr-x 1 jhuang jhuang 44K Jul 9 2020 ALL83
drwxr-xr-x 1 jhuang jhuang 20K Jul 9 2020 Data_Pietschmann_RSV_Probe_PUBLISHED
drwxr-xr-x 1 jhuang jhuang 8,0K Jul 27 2020 HyAsP_normal
drwxr-xr-x 1 jhuang jhuang 4,0K Jul 28 2020 Data_Manthey_16S
drwxr-xr-x 1 jhuang jhuang 8,0K Jul 29 2020 rtpd_files_DEL
drwxr-xr-x 1 jhuang jhuang 20K Aug 11 2020 HyAsP_bold
drwxr-xr-x 1 jhuang jhuang 44K Aug 17 2020 Data_HEV
drwxr-xr-x 1 jhuang jhuang 4,0K Sep 29 2020 Seq_VRE_hybridassembly
drwxr-xr-x 1 jhuang jhuang 12K Nov 11 2020 Data_Anna12_HAPDICS_raw_data_shovill_prokka
drwxr-xr-x 1 jhuang jhuang 12K Aug 10 2021 Data_Anna_HAPDICS_WGS_ALL
drwxr-xr-x 1 jhuang jhuang 20K Aug 10 2021 Data_HEV_Freiburg_2020
drwxr-xr-x 1 jhuang jhuang 20K Okt 27 2021 Data_Nicole_HDV_Recombination_PUBLISHED
-rwxr-xr-x 1 jhuang jhuang 905K Feb 8 2022 s_hero2x
-rwxr-xr-x 1 jhuang jhuang 5,5G Feb 25 2022 201030_M03701_0207_000000000-J57B4.zip
-rwxr-xr-x 1 jhuang jhuang 4,9K Mär 21 2022 README
-rwxr-xr-x 1 jhuang jhuang 4,9K Mär 21 2022 'README(1)'
-rwxr-xr-x 1 jhuang jhuang 848 Mär 28 2022 dna2.fasta.fai
-rwxr-xr-x 1 jhuang jhuang 17K Mär 28 2022 91.pep
-rwxr-xr-x 1 jhuang jhuang 9,1K Mär 28 2022 91.orf
-rwxr-xr-x 1 jhuang jhuang 222 Mär 28 2022 91.orf.fai
-rwxr-xr-x 1 jhuang jhuang 1,1M Mär 31 2022 dgaston-dec-06-2012-121211124858-phpapp01.pdf
-rwxr-xr-x 1 jhuang jhuang 5,2K Apr 4 2022 tileshop.fcgi
-rwxr-xr-x 1 jhuang jhuang 765K Apr 4 2022 ppat.1009304.s016.tif
-rwxr-xr-x 1 jhuang jhuang 4,1K Mai 2 2022 sequence.txt
-rwxr-xr-x 1 jhuang jhuang 4,0K Mai 2 2022 'sequence(1).txt'
-rwxr-xr-x 1 jhuang jhuang 3,7K Mai 23 2022 GSE128169_series_matrix.txt.gz
-rwxr-xr-x 1 jhuang jhuang 4,0K Mai 23 2022 GSE128169_family.soft.gz
drwxr-xr-x 1 jhuang jhuang 40K Mär 20 2023 Data_Anna_HAPDICS_RNASeq
drwxr-xr-x 1 jhuang jhuang 1,3M Apr 4 2023 Data_Christopher_MeDIP_MMc_PUBLISHED
drwxr-xr-x 1 jhuang jhuang 8,0K Jun 28 2023 Data_Gunnar_Yersiniomics_IMCOMPLETE_DEL
drwxr-xr-x 1 jhuang jhuang 28K Feb 12 2024 Data_Denise_RNASeq
drwxr-xr-x 1 jhuang jhuang 4,0K Apr 5 2024 'System Volume Information'
drwxr-xr-x 1 jhuang jhuang 0 Apr 5 2024 '$RECYCLE.BIN'
jhuang@WS-2290C:/media/jhuang/Elements(An14_RNAs)$ ls -tlrh
total 284K
drwxr-xr-x 1 jhuang jhuang 8,0K Aug 7 2017 Data_Anna10_RP62A
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 15 2018 Data_Nicole12_16S_Kluwe_Bunders
drwxr-xr-x 1 jhuang jhuang 4,0K Nov 30 2018 chromhmm-enhancers
drwxr-xr-x 1 jhuang jhuang 0 Apr 1 2019 Data_Denise_sT_Methylation
drwxr-xr-x 1 jhuang jhuang 0 Apr 1 2019 Data_Denise_LTtrunc_Methylation
drwxr-xr-x 1 jhuang jhuang 12K Apr 29 2019 Data_16S_arckNov
drwxr-xr-x 1 jhuang jhuang 4,0K Mai 29 2019 Data_Tabea_RNASeq
-rwxr-xr-x 1 jhuang jhuang 4,6K Mai 29 2019 nr_gz_README
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 5 2019 j_huang_raw_fq
drwxr-xr-x 1 jhuang jhuang 0 Jun 7 2019 'System Volume Information'
drwxr-xr-x 1 jhuang jhuang 0 Jun 7 2019 '$RECYCLE.BIN'
drwxr-xr-x 1 jhuang jhuang 36K Jun 18 2019 host_refs
drwxr-xr-x 1 jhuang jhuang 0 Jun 18 2019 Vraw
drwxr-xr-x 1 jhuang jhuang 68K Jul 29 2019 Data_Susanne_Amplicon_RdRp_orf1_2 *
drwxr-xr-x 1 jhuang jhuang 4,0K Aug 6 2019 tmp
drwxr-xr-x 1 jhuang jhuang 28K Sep 4 2020 Data_RNA188_Paul_Becher
drwxr-xr-x 1 jhuang jhuang 4,0K Nov 3 2020 Data_ChIPSeq_Laura
drwxr-xr-x 1 jhuang jhuang 12K Mai 7 2021 Data_16S_arckNov_review_PUBLISHED
drwxr-xr-x 1 jhuang jhuang 8,0K Mai 7 2021 Data_16S_arckNov_re
drwxr-xr-x 1 jhuang jhuang 20K Mai 25 2021 Fastqs
drwxr-xr-x 1 jhuang jhuang 4,0K Aug 9 2021 Data_Tabea_RNASeq_submission
drwxr-xr-x 1 jhuang jhuang 4,0K Aug 27 2021 Data_Anna_Cutibacterium_acnes_DEL
drwxr-xr-x 1 jhuang jhuang 0 Sep 16 2021 Data_Silvia_RNASeq_SUBMISSION
drwxr-xr-x 1 jhuang jhuang 4,0K Feb 9 2022 Data_Hannes_ChIPSeq
drwxr-xr-x 1 jhuang jhuang 4,0K Jul 5 2022 Data_Anna14_RNASeq_to_be_DEL
drwxr-xr-x 1 jhuang jhuang 40K Dez 15 2022 Data_Pietschmann_RSV_Probe2_PUBLISHED
drwxr-xr-x 1 jhuang jhuang 0 Dez 16 2022 Data_Holger_Klebsiella_pneumoniae_SNP_PUBLISHING
drwxr-xr-x 1 jhuang jhuang 4,0K Jun 29 2023 Data_Anna14_RNASeq_plus_public
jhuang@WS-2290C:/media/jhuang/Elements(Indra_HAPDICS)$ ls -trlh
total 452K
drwxrwxrwx 1 jhuang jhuang 20K Jul 3 2018 Data_Anna11_Sepdermidis_DEL
drwxrwxrwx 1 jhuang jhuang 20K Jul 12 2018 HD15_without_10
drwxrwxrwx 1 jhuang jhuang 12K Jul 12 2018 HD31
drwxrwxrwx 1 jhuang jhuang 20K Jul 12 2018 HD33
drwxrwxrwx 1 jhuang jhuang 20K Jul 12 2018 HD39
drwxrwxrwx 1 jhuang jhuang 20K Jul 12 2018 HD43
drwxrwxrwx 1 jhuang jhuang 20K Jul 12 2018 HD46
drwxrwxrwx 1 jhuang jhuang 20K Jul 12 2018 HD15_with_10
drwxrwxrwx 1 jhuang jhuang 12K Jul 13 2018 HD26
drwxrwxrwx 1 jhuang jhuang 20K Jul 13 2018 HD59
drwxrwxrwx 1 jhuang jhuang 12K Jul 13 2018 HD25
drwxrwxrwx 1 jhuang jhuang 20K Jul 16 2018 HD21
drwxrwxrwx 1 jhuang jhuang 20K Jul 17 2018 HD17
drwxrwxrwx 1 jhuang jhuang 24K Sep 24 2018 HD04
drwxrwxrwx 1 jhuang jhuang 20K Mär 5 2019 Data_Anna11_Pair1-6_P6
drwxrwxrwx 1 jhuang jhuang 4,0K Aug 15 2019 Data_Anna12_HAPDICS_HyAsP
drwxrwxrwx 1 jhuang jhuang 68K Dez 27 2019 HAPDICS_hyasp_plasmids
drwxrwxrwx 1 jhuang jhuang 8,0K Jan 14 2021 Data_Anna_HAPDICS_review
-rwxrwxrwx 1 jhuang jhuang 9,6K Jan 26 2021 data_overview.txt
drwxrwxrwx 1 jhuang jhuang 4,0K Jan 29 2021 align_assem_res_DEL
drwxrwxrwx 1 jhuang jhuang 0 Jun 8 2021 'System Volume Information'
drwxrwxrwx 1 jhuang jhuang 4,0K Jun 8 2021 EXCHANGE_DEL
drwxrwxrwx 1 jhuang jhuang 8,0K Aug 30 2021 Data_Indra_H3K4me3_public
drwxrwxrwx 1 jhuang jhuang 4,0K Feb 17 2022 Data_Gunnar_MS
drwxrwxrwx 1 jhuang jhuang 4,0K Jun 2 2022 '$RECYCLE.BIN'
drwxrwxrwx 1 jhuang jhuang 4,0K Jun 2 2022 UKE_DELLWorkstation_C_Users_indbe_Desktop
drwxrwxrwx 1 jhuang jhuang 4,0K Jun 2 2022 Linux_DELLWorkstation_C_Users_indbe_VirtualBoxVMs
drwxrwxrwx 1 jhuang jhuang 4,0K Jun 23 2022 Data_Anna_HAPDICS_RNASeq_rawdata
drwxrwxrwx 1 jhuang jhuang 8,0K Jun 23 2022 Data_Indra_H3K27ac_public
drwxrwxrwx 1 jhuang jhuang 28K Feb 22 2023 Data_Holger_Klebsiella_pneumoniae_SNP_PUBLISHING
drwxrwxrwx 1 jhuang jhuang 4,0K Dez 9 2024 DATA_INDRA_RNASEQ
drwxrwxrwx 1 jhuang jhuang 4,0K Dez 9 2024 DATA_INDRA_CHIPSEQ
jhuang@WS-2290C:/media/jhuang/Elements(jhuang_*)$ ls -ltrh
total 5,0M
-rwxr-xr-x 1 jhuang jhuang 657K Jul 9 2021 'Install Western Digital Software for Windows.exe'
-rwxr-xr-x 1 jhuang jhuang 498K Jul 9 2021 'Install Western Digital Software for Mac.dmg'
drwxr-xr-x 2 jhuang jhuang 1,0M Mai 17 2023 'System Volume Information'
drwxr-xr-x 2 jhuang jhuang 1,0M Aug 26 2024 '$RECYCLE.BIN'
drwxr-xr-x 11 jhuang jhuang 1,0M Feb 4 2025 20250203_FS10003086_95_BTR67811-0621
jhuang@WS-2290C:/media/jhuang/Smarty$ ls -tlrh
total 140K
drwx------ 2 jhuang jhuang 16K Mär 14 2018 lost+found
drwxrwxrwx 21 jhuang jhuang 68K Jun 10 2022 Blast_db
drwxrwxr-x 2 jhuang jhuang 4,0K Sep 5 2022 temporary_files_DEL
drwxrwxr-x 9 jhuang jhuang 12K Sep 6 2022 ALIGN_ASSEM
drwxr-xr-x 19 jhuang jhuang 4,0K Sep 29 2022 Data_Paul_Staphylococcus_epidermidis
drwxrwxr-x 11 jhuang jhuang 4,0K Jan 26 2023 Data_16S_Degenhardt_Marius_DEL
drwxrwxr-x 16 jhuang jhuang 4,0K Jun 28 2023 Data_Gunnar_Yersiniomics_DEL
drwxrwxr-x 6 jhuang jhuang 4,0K Jul 5 2023 Data_Manja_RNAseq_Organoids_Virus
drwxrwxr-x 19 jhuang jhuang 12K Sep 27 2023 Data_Emilia_MeDIP
drwxr-xr-x 14 jhuang jhuang 4,0K Okt 30 2023 DjangoApp_Backup_2023-10-30
drwxrwxr-x 5 jhuang jhuang 4,0K Apr 19 2024 ref
drwxrwxr-x 4 jhuang jhuang 4,0K Jul 22 2025 Data_Michelle_RNAseq_2025_raw_data_DEL_AFTER_UPLOAD_GEO
Protected: The role of SARS-CoV-2 viremia and immune-mediated tissue invasion in COVID-19 vascular pathology: an autopsy-based study.
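Quick reference by research area: commonly used life-science databases (selected from the Global Core Biodata Resources)
Below is a quick-reference list of commonly used databases, grouped by what you work on and where to look first, with one line on what each resource is for. In summary: for genomics, start with ENA/Ensembl/UCSC; for protein function, UniProt/InterPro; for pathways, Reactome; for drugs and small molecules, ChEMBL/ChEBI; for human variation, gnomAD/GWAS Catalog/ClinGen; for microbial nomenclature and 16S, LPSN/SILVA; for model organisms, FlyBase/WormBase/ZFIN/MGD.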
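Genomics and sequence data
- European Nucleotide Archive (ENA): comprehensive archive for raw sequencing data, assemblies, and annotations (the European node).
- DNA Data Bank of Japan (DDBJ): Japan's sequence data archive (an INSDC member).
- Ensembl: vertebrate genome browsing, comparative genomics, variation, and regulatory annotation.
- UCSC Genome Browser: visual browsing of human and many other genomes, with annotation tracks.
- GENCODE: high-quality human and mouse gene annotation sets (often used as the standard reference).
Microbiology/bacteria (strain information, nomenclature, 16S, etc.)
- BacDive: standardized strain-level information (culture conditions, phenotypes, origin, etc.).
- LPSN (List of Prokaryotic names with Standing in Nomenclature): authoritative prokaryotic nomenclature (whether a name is validly published, taxonomic updates).
- SILVA: 16S/18S and 23S/28S rRNA sequences and alignment datasets (widely used for classification and amplicon work).
Protein function annotation, families and domains, interaction networks
- UniProt: the main entry point for protein sequences and functional annotation (the most widely used).
- InterPro: integrated analysis of protein families, domains, and functional sites (for annotation and function prediction).
- CATH: structural classification and evolutionary relationships of protein domains.
- STRING: protein interaction networks (predictions plus integrated evidence), handy for functional association.
- IMEx (International Molecular Exchange Consortium): high-quality, manually curated molecular interaction data.
- Protein Data Bank (PDB): the global archive of 3D structures of proteins and nucleic acids (essential for structural biology).
Pathway, metabolism, and reaction databases
- Reactome: curated pathway knowledgebase (commonly used for enrichment analysis and mechanistic interpretation).
- Rhea: standardized knowledgebase of biochemical and transport reactions (annotation, metabolism research).
- BRENDA: comprehensive enzyme function data (substrates, kinetics, reactions, etc.).
- EcoCyc: finely curated genome and metabolic pathway annotation for Escherichia coli K-12.
Chemistry, small molecules, drug targets (drug discovery/chemical biology)
- ChEBI: dictionary/ontology of small chemical entities (standard names, structures, classification).
- ChEMBL: drug-like molecules, bioactivities, and target associations (widely used for drug discovery and repurposing).
- IUPHAR/BPS Guide to PHARMACOLOGY: authoritative pharmacology knowledgebase (ligand-target relationships, drug information).
- LIPID MAPS: lipidomics resources and a nomenclature/classification system.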
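Transcriptomes, expression profiles, protein expression atlases
- Bgee: cross-species comparison of expression patterns ("where is this gene expressed?").
- GXD: mouse gene expression database (development, tissue expression, etc.).
- Human Protein Atlas: protein expression and localization maps at the human tissue and cell level.
- Europe PMC: entry point to the life-science literature (full text/abstracts, funding information, etc.; efficient for background research).
Human genetic variation, GWAS, disease ontology, and clinical interpretation
- gnomAD: aggregated population variant frequencies (essential for filtering out common variants).
- GWAS Catalog: standardized database of GWAS SNP-trait associations.
- Clinical Genome Resource (ClinGen): resource for assessing the clinical relevance of genes and variants (precision medicine).
- CIViC (Clinical Interpretation of Variants in Cancer): community-curated platform for the clinical significance of cancer variants.
- Human Disease Ontology Knowledgebase: disease ontology (unified terminology, useful for integrative analysis).
- ClinPGx: curated pharmacogenomics knowledge (how genetic variants affect drug response).
Model organisms and species-specific databases
- FlyBase: Drosophila genetic and molecular data.
- WormBase: genomic and biological data for Caenorhabditis elegans and related nematodes.
- ZFIN (The Zebrafish Information Network): zebrafish model data.
- MGD (Mouse Genome Database): mouse genome data with phenotype and disease associations.
- PomBase: fission yeast resource.
- Saccharomyces Genome Database: budding yeast database.
- Rat Genome Database: rat genome, phenotype, and disease data.
- Alliance of Genome Resources: integrated entry point across model-organism resources (convenient for cross-species comparison).
Biodiversity, species checklists, and taxonomy
- Catalogue of Life: unified checklist and taxonomic information for the world's known species.
- Global Biodiversity Information Facility (GBIF): open data platform for global biodiversity observations, specimen records, and more.
Pathogens and vectors (parasites, vector insects, etc.)
- VEuPathDB: collection of large-scale omics databases for eukaryotic pathogens and invertebrate vectors.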
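Which NCBI submission portal should you choose? A decision tree to keep you on track (GenBank / SRA / Genome / TSA / BioProject / BioSample / dbGaP / GTR / ClinVar)
Many people are lost the first time they click "Start a new submission" at NCBI: with so many entry points, which one is right? The decision tree below starts from your goal, and following it will rarely lead you astray. If what you want to publish is data files (FASTQ/FASTA/assemblies/annotations), do not choose GTR; choose GTR only if you want to publish the service description of a clinical or research test.
✅ Step 1: Are you submitting raw sequencing data or assembly/annotation results?
A. I have FASTQ files (raw reads: Illumina/Nanopore/PacBio)
➡️ Choose the Sequence Read Archive (SRA)
- You submit: reads plus library information (platform, PE/SE, strategy, etc.)
- Almost every journal requires raw data for reproducibility, which means SRA
You will usually also need:
- BioSample (the "ID card" of each sample)
- BioProject (ties all data of the project together)
✅ Typical path: BioProject → BioSample → SRA
B. I have an assembled genome (contigs/scaffolds/complete genome)
➡️ Choose Genome (the main portal for genome submissions)
- Suitable for: draft or complete genomes of bacteria, fungi, viruses, and eukaryotes
- Linked to the GenBank/Assembly system (searchable and citable once public)
You will usually also need:
- BioSample (sample origin information)
- BioProject (project summary)
- (Optional but strongly recommended) SRA (if you are also willing to release the raw reads)
✅ Typical path: BioProject → BioSample → SRA (optional/recommended) → Genome
C. I only have a single gene, fragment, or plasmid sequence (not a whole-genome project)
➡️ Choose GenBank
- Suitable for: single genes, sequence fragments, individual plasmid sequences, specific regions
- If you are running a systematic genome project, Genome is usually the better fit; GenBank is more of a portal for individual sequence records.
D. I have a transcriptome assembly (assembled transcripts, not reads)
➡️ Choose TSA (Transcriptome Shotgun Assembly)
- TSA takes the assembled transcript sequences
- The raw RNA-seq reads should still go to SRA
✅ Typical path: BioProject → BioSample → SRA → TSA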
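✅ Step 2: Are you submitting sensitive clinical human data, variant interpretations, or a testing service?
E. The data involve the privacy of human subjects and need controlled access (phenotype plus genotype, clinical cohorts)
➡️ Choose dbGaP (controlled access)
- Suitable for: sensitive human data
- Usually comes with ethics, authorization, and review procedures (not freely downloadable)
F. You want to submit the clinical interpretation of variants (pathogenicity, evidence, phenotype associations)
➡️ Choose ClinVar
- Suitable for: clinical laboratories and research teams sharing variant interpretations
G. You want to register a genetic test or testing service
➡️ Choose GTR (Genetic Testing Registry)
- This is a registration of the test itself, not an upload of sequencing data
✅ Step 3: Are you managing a collection of data as one project?
H. You have multiple samples, batches, or data types (SRA + Genome + others)
➡️ Create a BioProject first
- Purpose: the project's master index, convenient for citation and retrieval
I. Every sample needs traceable metadata (source, location, date, host, etc.)
➡️ You will almost always need BioSample
- Purpose: the sample's ID card; SRA and Genome submissions usually link to it
The ultimate quick-selection mnemonic
- Raw FASTQ reads → SRA
- Genome assembly (contigs/scaffolds/complete) → Genome
- Transcriptome assembly (transcripts) → TSA
- Single gene/fragment/plasmid sequence record → GenBank
- Tie everything together as one project → BioProject
- Origin information for each sample → BioSample
- Sensitive, controlled-access human data → dbGaP
- Clinical variant interpretation → ClinVar
- Genetic test registration → GTR
- Bulk/automated submission → API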
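The mnemonic above can be read as a simple lookup. Here is a minimal Python sketch of that reading. The category keys such as `raw_reads` and `genome_assembly` and the function name `submission_path` are invented for illustration and are not NCBI terminology; the portal names and the BioProject/BioSample companions come from the decision tree above. The optional SRA step for genome assemblies is left out for brevity.

```python
# Hypothetical category keys -> NCBI submission portal, following the mnemonic above.
PORTAL_BY_DATA_TYPE = {
    "raw_reads": "SRA",
    "genome_assembly": "Genome",
    "transcriptome_assembly": "TSA",
    "single_sequence": "GenBank",
    "controlled_human_data": "dbGaP",
    "variant_interpretation": "ClinVar",
    "genetic_test_registration": "GTR",
}

# Sequence-data submissions that usually also need BioProject and BioSample records.
NEEDS_PROJECT_AND_SAMPLE = {"raw_reads", "genome_assembly", "transcriptome_assembly"}


def submission_path(data_type: str) -> list[str]:
    """Return the usual order of NCBI records for one kind of data."""
    if data_type not in PORTAL_BY_DATA_TYPE:
        raise ValueError(f"unknown data type: {data_type!r}")
    path = []
    if data_type in NEEDS_PROJECT_AND_SAMPLE:
        path += ["BioProject", "BioSample"]
    # Assembled transcripts follow the raw RNA-seq reads, which go to SRA first.
    if data_type == "transcriptome_assembly":
        path.append("SRA")
    path.append(PORTAL_BY_DATA_TYPE[data_type])
    return path


if __name__ == "__main__":
    for kind in PORTAL_BY_DATA_TYPE:
        print(f"{kind:28s} -> {' -> '.join(submission_path(kind))}")
```

A table like this is only a first filter; edge cases (for example a plasmid that belongs to a genome project) still need the reasoning in the decision tree.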
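Below is a more detailed description of GTR (Genetic Testing Registry).
What is GTR?
GTR is a public NCBI catalog for registering genetic tests and testing services. Information is submitted voluntarily by the laboratories and institutions that offer the tests, so that the public, clinicians, and researchers can look up which tests exist for a given disease, gene, or pathogen, which laboratories offer them, what methods they use, and what their scope and supporting evidence are. (NCBI)
Key point: GTR is not for uploading FASTQ files or genome sequences.
- Raw sequencing data → SRA
- Genome assemblies/annotations → Genome / GenBank
- GTR → registers information about the test itself (comparable to a yellow pages or directory of tests) (NCBI)
Which tests does GTR cover?
GTR's scope goes beyond traditional single-gene tests for inherited disease and also includes:
- Tests for Mendelian disorders and drug response (pharmacogenomics)
- Tumor/somatic variant tests
- Multi-gene panels, arrays, biochemical, cytogenetic, and molecular tests (NCBI)
- Microbial/pathogen tests (for example pathogen panels, viral load, serological antibody/antigen tests) (NCBI)
What does a test record in GTR usually contain?
Think of it as a test's instruction leaflet combined with laboratory information. Common fields include:
- Purpose of the test: diagnosis, carrier screening, prognosis, therapy guidance, etc. (NCBI)
- Target: gene/region, variant type, or pathogen target
- Methodology: for example PCR, Sanger, NGS panel, MLPA, array, qPCR, Nanopore (state platform and strategy clearly) (NCBI)
- Indication/Condition: the diseases or phenotypes the test addresses; a declared test-target-indication relationship can be recorded (NCBI)
- Performance and evidence: analytical/clinical validity, references, guidelines or standards (GTR emphasizes presenting use and evidence) (NCBI)
- Laboratory information: institution name, contacts, credentials/certifications, range of services offered (NCBI)
- GTR accession: every test has a unique identifier that can be cited in papers or in an EHR. (NCBI)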
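Who should submit to GTR?
Mainly laboratories or institutions that offer genetic/molecular testing services (clinical laboratory departments, independent medical laboratories, commercial testing companies, research laboratories, etc.). (NCBI)
If you are doing research and only want to make your data public:
- Data release usually goes through BioProject/BioSample + SRA + Genome/GenBank
- GTR is not necessarily required (unless you offer a test or testing service to external users) (NCBI)
How do you submit to GTR? (Process overview)
A GTR submission generally has two steps:
1) Register the laboratory first (Laboratory record)
The laboratory is registered as an entity; GTR reviews and contacts new registrants, and only after approval can the laboratory submit individual tests. (NCBI)
2) Then submit the test (Test record)
There are two ways:
- Interactive web submission: fill in the information page by page in the submission portal (suited to a small number of tests) (NCBI)
- Bulk submission (Excel template): suited to large numbers of clinical tests; upload with either the full-field or the minimal-field template (bulk upload of research tests is generally not offered/supported). (NCBI)
GTR vs ClinVar vs dbGaP: the three that are most easily confused
- GTR: registers information about tests and testing services (who offers them, how they are run, what they measure, indications/evidence) (NCBI)
- ClinVar: takes interpretations of the clinical significance of variants, with evidence (pathogenicity classification, etc.)
- dbGaP: controlled-access archive for sensitive human data (genotype/phenotype)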
From Salmon to Subset Heatmaps: A Reproducible Pipeline for Phage/Stress/Biofilm Gene Panels (No p-value Cutoff, Data_JuliaFuchs_RNAseq_2025)

This post documents a complete, batch-ready pipeline to generate subset heatmaps (phage / stress-response / biofilm-associated) from bacterial RNA-seq data quantified with Salmon, using DE tables without any p-value cutoff.
You will end with:
- Three gene sets (A/B/C):
  - A (phage/prophage genes): extracted from MT880872.1.gb, mapped to CP052959 via BLASTN, converted to CP052959 GeneID_plain
  - B (stress genes): keyword-based selection from CP052959 GenBank annotations
  - C (biofilm genes): keyword-based selection from CP052959 GenBank annotations
- For each *-all_annotated.csv in results/star_salmon/degenes/:
  - Subset GOI lists for A/B/C (no cutoff; include all rows belonging to the geneset)
  - Per-comparison *_matched.tsv tables for sanity checks
- Merged 3-condition heatmaps (Untreated + Mitomycin + Moxi) per timepoint (4h/8h/18h) and subset (A/B/C), giving 9 final figures
- An Excel file per heatmap containing GeneID, GeneName, Description, and the plotted expression matrix
Everything is written so you can run a single shell script for genesets + intersections, then one R script for heatmaps.
0) Environments
We use two conda environments:
- plot-numpy1 for Python tools and BLAST setup
- r_env for DESeq2 + plotting heatmaps in R
conda activate plot-numpy1
1) Directory layout
From your project root:
.
├── CP052959.gb
├── MT880872.1.gb
├── results/star_salmon/degenes/
│ ├── Mitomycin_4h_vs_Untreated_4h-all_annotated.csv
│ ├── ...
└── subset_heatmaps/ # all scripts + outputs go here
Create the output directory:
mkdir -p subset_heatmaps
2) Step A/B/C gene set generation + batch intersection (one command)
This section generates:
- geneset_A_phage_GeneID_plain.id (+ GeneID.id)
- geneset_B_stress_GeneID_plain.id (+ GeneID.id)
- geneset_C_biofilm_GeneID_plain.id (+ GeneID.id)
- plus all per-contrast GOI_* files and *_matched.tsv
2.1 Script: extract CDS FASTA from MT880872.1.gb
Save as subset_heatmaps/extract_cds_fasta.py
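#!/usr/bin/env python3
"""Write one FASTA record per CDS feature of a single-record GenBank file."""
from Bio import SeqIO
import sys

gb = sys.argv[1]      # input GenBank file, e.g. MT880872.1.gb
out_fa = sys.argv[2]  # output FASTA path

rec = SeqIO.read(gb, "genbank")  # expects exactly one record in the file
with open(out_fa, "w") as out:
    for f in rec.features:
        if f.type != "CDS":
            continue
        # CDS features without a locus_tag all share the header "NA"
        locus = f.qualifiers.get("locus_tag", ["NA"])[0]
        seq = f.extract(rec.seq)  # respects strand and joined locations
        out.write(f">{locus}\n{str(seq).upper()}\n")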
2.2 Script: BLAST hit mapping → CP052959 GeneID_plain set (geneset A)
Save as subset_heatmaps/blast_hits_to_geneset.py
#!/usr/bin/env python3
import sys
import pandas as pd
from Bio import SeqIO

blast6 = sys.argv[1]
cp_gb = sys.argv[2]
prefix = sys.argv[3]  # e.g. subset_heatmaps/geneset_A_phage

# Load CP052959 CDS intervals
rec = SeqIO.read(cp_gb, "genbank")
cds = []
for f in rec.features:
    if f.type != "CDS":
        continue
    locus = f.qualifiers.get("locus_tag", [None])[0]
    if locus is None:
        continue
    # Biopython locations are 0-based, end-exclusive; convert to 1-based inclusive like BLAST
    start = int(f.location.start) + 1
    end = int(f.location.end)
    cds.append((locus, start, end))
cds_df = pd.DataFrame(cds, columns=["GeneID_plain", "start", "end"])

# Load BLAST tabular (outfmt 6)
cols = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen", "qstart", "qend",
        "sstart", "send", "evalue", "bitscore"]
b = pd.read_csv(blast6, sep="\t", names=cols)

# Normalize subject coordinates (minus-strand hits have sstart > send)
b["smin"] = b[["sstart", "send"]].min(axis=1)
b["smax"] = b[["sstart", "send"]].max(axis=1)

# Filter for strong hits (tune if needed)
b = b[(b["pident"] >= 90.0) & (b["length"] >= 100)]

# Collect every CDS that overlaps a hit by at least one base
hits = set()
for _, r in b.iterrows():
    ov = cds_df[(r["smin"] <= cds_df["end"]) & (r["smax"] >= cds_df["start"])]
    hits.update(ov["GeneID_plain"].unique().tolist())
hits = sorted(hits)

plain_path = f"{prefix}_GeneID_plain.id"
geneid_path = f"{prefix}_GeneID.id"
pd.Series(hits).to_csv(plain_path, index=False, header=False)
pd.Series(["gene-" + x for x in hits]).to_csv(geneid_path, index=False, header=False)
print(f"Wrote {len(hits)} genes:")
print(" ", plain_path)
print(" ", geneid_path)
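The coordinate handling in this script is easy to get wrong, so here is a minimal standalone sketch of the same logic without pandas or Biopython. The helper names `normalize_hit` and `overlapping_genes` and the toy locus tags are hypothetical; the rules follow the script: BLAST reports `sstart > send` for minus-strand hits, CDS intervals are 1-based inclusive, and two closed intervals overlap when each starts no later than the other ends.

```python
def normalize_hit(sstart, send):
    """Return (smin, smax) so minus-strand hits (sstart > send) compare correctly."""
    return min(sstart, send), max(sstart, send)


def overlapping_genes(hit, cds_intervals):
    """List locus tags whose 1-based inclusive [start, end] overlaps the hit."""
    smin, smax = normalize_hit(*hit)
    return [locus for locus, start, end in cds_intervals
            if smin <= end and smax >= start]


# Toy CDS table (hypothetical locus tags), 1-based inclusive coordinates
cds = [("GENE_A", 100, 400), ("GENE_B", 450, 900), ("GENE_C", 1000, 1500)]

print(overlapping_genes((350, 500), cds))    # plus-strand hit spanning two genes
print(overlapping_genes((1200, 1100), cds))  # minus-strand hit inside GENE_C
print(overlapping_genes((401, 449), cds))    # intergenic hit, no gene
```

Any overlap counts here, even a single base, so a long hit can pull in neighbouring genes. Requiring a minimum overlap length would be a stricter variant if geneset A turns out too broad.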
2.3 Script: keyword-based genesets B/C from CP052959 annotations
Save as subset_heatmaps/geneset_by_keywords.py
#!/usr/bin/env python3
import sys, re
import pandas as pd
from Bio import SeqIO

cp_gb = sys.argv[1]
mode = sys.argv[2]    # "stress" or "biofilm"
prefix = sys.argv[3]  # e.g. subset_heatmaps/geneset_B_stress

rec = SeqIO.read(cp_gb, "genbank")
rows = []
for f in rec.features:
    if f.type != "CDS":
        continue
    locus = f.qualifiers.get("locus_tag", [None])[0]
    if locus is None:
        continue
    gene = (f.qualifiers.get("gene", [""])[0] or "")
    product = (f.qualifiers.get("product", [""])[0] or "")
    note = "; ".join(f.qualifiers.get("note", [])) if f.qualifiers.get("note") else ""
    text = " ".join([gene, product, note]).strip()
    rows.append((locus, gene, product, note, text))
df = pd.DataFrame(rows, columns=["GeneID_plain", "gene", "product", "note", "text"])

# Note: \b...\b matches whole words only, so short stems such as "clp", "uvr" or "ica"
# do not match gene names like "clpP", "uvrA" or "icaA". Extend a stem (e.g. clp\w*)
# if prefix matching is intended.
if mode == "stress":
    rgx = re.compile(
        r"\b(stress|heat shock|chaperone|dnaK|groEL|groES|clp|thioredoxin|peroxiredoxin|catalase|superoxide|"
        r"recA|lexA|uvr|mutS|mutL|usp|osm|sox|katA|sod)\b",
        re.I
    )
elif mode == "biofilm":
    rgx = re.compile(
        r"\b(biofilm|ica|pga|polysaccharide|PIA|adhesin|MSCRAMM|fibrinogen-binding|fibronectin-binding|"
        r"clumping factor|sortase|autolysin|atl|nuclease|DNase|protease|dispersin|luxS|agr|sarA|dlt)\b",
        re.I
    )
else:
    raise SystemExit("mode must be stress or biofilm")

sel = df[df["text"].apply(lambda x: bool(rgx.search(x)))].copy()
hits = sorted(sel["GeneID_plain"].unique())

plain_path = f"{prefix}_GeneID_plain.id"
geneid_path = f"{prefix}_GeneID.id"
sel_path = f"{prefix}_hits.tsv"
pd.Series(hits).to_csv(plain_path, index=False, header=False)
pd.Series(["gene-" + x for x in hits]).to_csv(geneid_path, index=False, header=False)
sel.drop(columns=["text"]).to_csv(sel_path, sep="\t", index=False)
print(f"{mode}: wrote {len(hits)} genes:")
print(" ", plain_path)
print(" ", geneid_path)
print(" ", sel_path)
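One property of the keyword patterns is worth checking before trusting the B/C sets: because every alternative sits between `\b` anchors, short stems such as `clp` or `ica` only match as whole words. A minimal sketch with made-up annotation strings follows; the example strings and the relaxed pattern are illustrative assumptions, not part of the pipeline.

```python
import re

# Same structure as the pipeline patterns: whole-word match only
strict = re.compile(r"\b(stress|clp|recA)\b", re.I)
# Hypothetical relaxed variant: allow a suffix after the short stem "clp"
relaxed = re.compile(r"\b(stress|clp\w*|recA)\b", re.I)

examples = [
    "clpP ATP-dependent Clp protease proteolytic subunit",
    "clpX",
    "general stress protein",
    "hypothetical protein",
]

for text in examples:
    print(f"{text!r}: strict={bool(strict.search(text))}, relaxed={bool(relaxed.search(text))}")
```

Relaxing a stem also widens the net (for example `ica\w*` would match any word beginning with "ica"), so it is advisable to review the `*_hits.tsv` output after changing a pattern.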
2.4 Script: intersect each DE table with A/B/C (no cutoff) and write GOI lists + matched TSV
Save as subset_heatmaps/make_goi_lists_batch.py
#!/usr/bin/env python3
import sys, glob, os
import pandas as pd

de_dir = sys.argv[1]          # results/star_salmon/degenes
out_dir = sys.argv[2]         # subset_heatmaps
genesetA_plain = sys.argv[3]  # subset_heatmaps/geneset_A_phage_GeneID_plain.id
genesetB_plain = sys.argv[4]  # subset_heatmaps/geneset_B_stress_GeneID_plain.id
genesetC_plain = sys.argv[5]  # subset_heatmaps/geneset_C_biofilm_GeneID_plain.id

def load_plain_ids(path):
    with open(path) as f:
        return set(x.strip() for x in f if x.strip())

A = load_plain_ids(genesetA_plain)
B = load_plain_ids(genesetB_plain)
C = load_plain_ids(genesetC_plain)

def pick_id_cols(df):
    geneid = "GeneID" if "GeneID" in df.columns else None
    plain = "GeneID_plain" if "GeneID_plain" in df.columns else None
    if plain is None and "GeneName" in df.columns:
        plain = "GeneName"
    return geneid, plain

os.makedirs(out_dir, exist_ok=True)
for csv in sorted(glob.glob(os.path.join(de_dir, "*-all_annotated.csv"))):
    base = os.path.basename(csv).replace("-all_annotated.csv", "")
    df = pd.read_csv(csv)
    geneid_col, plain_col = pick_id_cols(df)
    if plain_col is None:
        raise SystemExit(f"Cannot find GeneID_plain/GeneName in {csv}")
    df["__plain__"] = df[plain_col].astype(str).str.replace("^gene-", "", regex=True)

    def write_set(tag, S):
        sub = df[df["__plain__"].isin(S)].copy()
        out_plain = os.path.join(out_dir, f"GOI_{base}_{tag}_GeneID_plain.id")
        out_geneid = os.path.join(out_dir, f"GOI_{base}_{tag}_GeneID.id")
        out_tsv = os.path.join(out_dir, f"{base}_{tag}_matched.tsv")
        sub["__plain__"].drop_duplicates().to_csv(out_plain, index=False, header=False)
        pd.Series(["gene-" + x for x in sub["__plain__"].drop_duplicates()]).to_csv(out_geneid, index=False, header=False)
        sub.to_csv(out_tsv, sep="\t", index=False)
        print(f"{base} {tag}: {sub.shape[0]} rows, {sub['__plain__'].nunique()} genes")

    write_set("A_phage", A)
    write_set("B_stress", B)
    write_set("C_biofilm", C)
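The matching step boils down to stripping an optional `gene-` prefix and testing set membership, with no filtering on p-value or fold change. A standard-library sketch of that idea is shown here, using invented locus tags and the hypothetical helpers `strip_gene_prefix` and `rows_in_geneset`:

```python
import re


def strip_gene_prefix(gene_id):
    """Remove a leading 'gene-' so IDs from both naming styles compare equal."""
    return re.sub(r"^gene-", "", str(gene_id))


def rows_in_geneset(rows, geneset, id_key="GeneID_plain"):
    """Keep every row whose normalized ID is in the geneset (no statistical cutoff)."""
    return [r for r in rows if strip_gene_prefix(r[id_key]) in geneset]


# Invented DE rows; padj is deliberately ignored by the selection
de_rows = [
    {"GeneID_plain": "gene-TAG_0001", "log2FoldChange": 2.1, "padj": 0.001},
    {"GeneID_plain": "gene-TAG_0002", "log2FoldChange": 0.1, "padj": 0.95},
    {"GeneID_plain": "TAG_0003", "log2FoldChange": -1.4, "padj": 0.20},
]
geneset = {"TAG_0002", "TAG_0003"}

matched = rows_in_geneset(de_rows, geneset)
print([r["GeneID_plain"] for r in matched])
```

Keeping non-significant rows is intentional in this workflow: the heatmaps are meant to show the whole panel, so significance is read from the plotted values and the accompanying tables instead of being used as a filter.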
2.5 Driver: run everything with one command
Save as subset_heatmaps/run_subset_setup.sh
#!/usr/bin/env bash
set -euo pipefail
DE_DIR="./results/star_salmon/degenes"
OUT_DIR="./subset_heatmaps"
CP_GB="CP052959.gb"
PHAGE_GB="MT880872.1.gb"
mkdir -p "$OUT_DIR"
echo "[INFO] Using DE_DIR=$DE_DIR"
ls -lh "$DE_DIR"/*-all_annotated.csv
# ---- A) BLAST-based phage/prophage geneset ----
python - <<'PY'
from Bio import SeqIO
rec=SeqIO.read("CP052959.gb","genbank")
SeqIO.write(rec, "subset_heatmaps/CP052959.fna", "fasta")
PY
python subset_heatmaps/extract_cds_fasta.py "$PHAGE_GB" "$OUT_DIR/MT880872_CDS.fna"
makeblastdb -in "$OUT_DIR/CP052959.fna" -dbtype nucl -out "$OUT_DIR/CP052959_db" >/dev/null
blastn \
-query "$OUT_DIR/MT880872_CDS.fna" \
-db "$OUT_DIR/CP052959_db" \
-out "$OUT_DIR/MT_vs_CP.blast6" \
-outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore" \
-evalue 1e-10
python subset_heatmaps/blast_hits_to_geneset.py \
"$OUT_DIR/MT_vs_CP.blast6" "$CP_GB" "$OUT_DIR/geneset_A_phage"
# ---- B/C) keyword-based genesets ----
python subset_heatmaps/geneset_by_keywords.py "$CP_GB" stress "$OUT_DIR/geneset_B_stress"
python subset_heatmaps/geneset_by_keywords.py "$CP_GB" biofilm "$OUT_DIR/geneset_C_biofilm"
# ---- Batch: intersect each DE CSV with the genesets (no cutoff) ----
python subset_heatmaps/make_goi_lists_batch.py \
"$DE_DIR" "$OUT_DIR" \
"$OUT_DIR/geneset_A_phage_GeneID_plain.id" \
"$OUT_DIR/geneset_B_stress_GeneID_plain.id" \
"$OUT_DIR/geneset_C_biofilm_GeneID_plain.id"
echo "[INFO] Done. GOI lists are in $OUT_DIR"
ls -1 "$OUT_DIR"/GOI_*_GeneID.id | head
Run it:
bash subset_heatmaps/run_subset_setup.sh
At this point you will have all *_matched.tsv files required for plotting, e.g.:
- Mitomycin_4h_vs_Untreated_4h_A_phage_matched.tsv
- Moxi_4h_vs_Untreated_4h_A_phage_matched.tsv
- … (for 8h/18h and B/C)
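Before moving on to plotting, it can help to confirm that every expected file is actually there. The following is a minimal sketch, assuming the naming scheme shown above (two treatments, three time points, three gene sets, so 18 files) and the default output directory subset_heatmaps; the function names are made up for this example.

```python
from itertools import product
from pathlib import Path

TREATMENTS = ["Mitomycin", "Moxi"]
TIMES = ["4h", "8h", "18h"]
TAGS = ["A_phage", "B_stress", "C_biofilm"]


def expected_matched_files():
    """Return the 18 expected *_matched.tsv file names."""
    return [
        f"{trt}_{t}_vs_Untreated_{t}_{tag}_matched.tsv"
        for trt, t, tag in product(TREATMENTS, TIMES, TAGS)
    ]


def missing_matched_files(out_dir="subset_heatmaps"):
    """List the expected files that are not present in out_dir."""
    return [
        name for name in expected_matched_files()
        if not (Path(out_dir) / name).is_file()
    ]
```

Calling missing_matched_files() from the project root should return an empty list once run_subset_setup.sh has finished for all contrasts.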
3) No-cutoff heatmaps (merged Untreated + Mitomycin + Moxi → 9 figures)
Now switch to your R environment and build the rlog (rld) expression matrix from Salmon quantifications.
conda activate r_env
3.1 Build rld from Salmon outputs (R)
library(tximport)
library(DESeq2)
setwd("~/DATA/Data_JuliaFuchs_RNAseq_2025/results/star_salmon")
files <- c(
"Untreated_4h_r1" = "./Untreated_4h_1a/quant.sf",
"Untreated_4h_r2" = "./Untreated_4h_1b/quant.sf",
"Untreated_4h_r3" = "./Untreated_4h_1c/quant.sf",
"Untreated_8h_r1" = "./Untreated_8h_1d/quant.sf",
"Untreated_8h_r2" = "./Untreated_8h_1e/quant.sf",
"Untreated_8h_r3" = "./Untreated_8h_1f/quant.sf",
"Untreated_18h_r1" = "./Untreated_18h_1g/quant.sf",
"Untreated_18h_r2" = "./Untreated_18h_1h/quant.sf",
"Untreated_18h_r3" = "./Untreated_18h_1i/quant.sf",
"Mitomycin_4h_r1" = "./Mitomycin_4h_2a/quant.sf",
"Mitomycin_4h_r2" = "./Mitomycin_4h_2b/quant.sf",
"Mitomycin_4h_r3" = "./Mitomycin_4h_2c/quant.sf",
"Mitomycin_8h_r1" = "./Mitomycin_8h_2d/quant.sf",
"Mitomycin_8h_r2" = "./Mitomycin_8h_2e/quant.sf",
"Mitomycin_8h_r3" = "./Mitomycin_8h_2f/quant.sf",
"Mitomycin_18h_r1" = "./Mitomycin_18h_2g/quant.sf",
"Mitomycin_18h_r2" = "./Mitomycin_18h_2h/quant.sf",
"Mitomycin_18h_r3" = "./Mitomycin_18h_2i/quant.sf",
"Moxi_4h_r1" = "./Moxi_4h_3a/quant.sf",
"Moxi_4h_r2" = "./Moxi_4h_3b/quant.sf",
"Moxi_4h_r3" = "./Moxi_4h_3c/quant.sf",
"Moxi_8h_r1" = "./Moxi_8h_3d/quant.sf",
"Moxi_8h_r2" = "./Moxi_8h_3e/quant.sf",
"Moxi_8h_r3" = "./Moxi_8h_3f/quant.sf",
"Moxi_18h_r1" = "./Moxi_18h_3g/quant.sf",
"Moxi_18h_r2" = "./Moxi_18h_3h/quant.sf",
"Moxi_18h_r3" = "./Moxi_18h_3i/quant.sf"
)
txi <- tximport(files, type = "salmon", txIn = TRUE, txOut = TRUE)
replicate <- factor(rep(c("r1","r2","r3"), 9))
condition <- factor(c(
rep("Untreated_4h",3), rep("Untreated_8h",3), rep("Untreated_18h",3),
rep("Mitomycin_4h",3), rep("Mitomycin_8h",3), rep("Mitomycin_18h",3),
rep("Moxi_4h",3), rep("Moxi_8h",3), rep("Moxi_18h",3)
))
colData <- data.frame(condition=condition, replicate=replicate, row.names=names(files))
dds <- DESeqDataSetFromTximport(txi, colData, design = ~ condition)
rld <- rlogTransformation(dds)
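The hand-typed files vector above follows a regular pattern (condition index 1 to 3, replicate letters a to i running across the three time points), so it can be cross-checked programmatically. This is only a sketch in Python, assuming exactly the directory naming visible in the R code; build_quant_paths is a hypothetical helper, not part of the pipeline.

```python
import string

CONDITIONS = ["Untreated", "Mitomycin", "Moxi"]  # directory suffix index 1, 2, 3
TIMES = ["4h", "8h", "18h"]


def build_quant_paths():
    """Rebuild the sample -> quant.sf mapping used in the R 'files' vector."""
    paths = {}
    for cond_index, cond in enumerate(CONDITIONS, start=1):
        letters = iter(string.ascii_lowercase)  # a..i, restarted per condition
        for time in TIMES:
            for rep in (1, 2, 3):
                letter = next(letters)
                sample = f"{cond}_{time}_r{rep}"
                paths[sample] = f"./{cond}_{time}_{cond_index}{letter}/quant.sf"
    return paths
```

Comparing this dictionary with the R vector (or checking each path with os.path.isfile) is a cheap way to catch a typo before tximport fails on a missing file.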
3.2 Plot merged 3-condition subset heatmaps (R)
suppressPackageStartupMessages(library(gplots))
need <- c("openxlsx")
to_install <- setdiff(need, rownames(installed.packages()))
if (length(to_install)) install.packages(to_install, repos = "https://cloud.r-project.org")
suppressPackageStartupMessages(library(openxlsx))
in_dir <- "subset_heatmaps"
out_dir <- file.path(in_dir, "heatmaps_merged3")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
pick_col <- function(df, candidates) {
hit <- intersect(candidates, names(df))
if (length(hit) == 0) return(NA_character_)
hit[1]
}
strip_gene_prefix <- function(x) sub("^gene[-_]", "", x)
match_tags <- function(nms, tags) {
pat <- paste0("(^|_)(?:", paste(tags, collapse = "|"), ")(_|$)")
grepl(pat, nms, perl = TRUE)
}
detect_tag <- function(nm, tags) {
hits <- vapply(tags, function(t)
grepl(paste0("(^|_)", t, "(_|$)"), nm, perl = TRUE), logical(1))
if (!any(hits)) NA_character_ else tags[which(hits)[1]]
}
make_pretty_labels <- function(gene_ids_in_matrix, id2name, id2desc) {
plain <- strip_gene_prefix(gene_ids_in_matrix)
nm <- unname(id2name[plain]); ds <- unname(id2desc[plain])
nm[is.na(nm)] <- ""; ds[is.na(ds)] <- ""
nm2 <- ifelse(nzchar(nm), nm, plain)
lbl <- ifelse(nzchar(ds), paste0(nm2, " (", ds, ")"), nm2)
make.unique(lbl, sep = "_")
}
if (exists("rld")) {
expr_all <- assay(rld)
} else if (exists("vsd")) {
expr_all <- assay(vsd)
} else {
stop("Neither 'rld' nor 'vsd' exists. Create/load it before running this script.")
}
expr_all <- as.matrix(expr_all)
mat_ids <- rownames(expr_all)
if (is.null(mat_ids)) stop("Expression matrix has no rownames.")
times <- c("4h", "8h", "18h")
tags <- c("A_phage", "B_stress", "C_biofilm")
cond_order_template <- c("Untreated_%s", "Mitomycin_%s", "Moxi_%s")
for (tt in times) {
for (tag in tags) {
f_mito <- file.path(in_dir, sprintf("Mitomycin_%s_vs_Untreated_%s_%s_matched.tsv", tt, tt, tag))
f_moxi <- file.path(in_dir, sprintf("Moxi_%s_vs_Untreated_%s_%s_matched.tsv", tt, tt, tag))
if (!file.exists(f_mito) || !file.exists(f_moxi)) next
df1 <- read.delim(f_mito, sep = "\t", header = TRUE, stringsAsFactors = FALSE, check.names = FALSE)
df2 <- read.delim(f_moxi, sep = "\t", header = TRUE, stringsAsFactors = FALSE, check.names = FALSE)
id_col_1 <- pick_col(df1, c("GeneID","GeneID_plain","Gene_Id","gene_id","locus_tag","LocusTag","ID"))
id_col_2 <- pick_col(df2, c("GeneID","GeneID_plain","Gene_Id","gene_id","locus_tag","LocusTag","ID"))
if (is.na(id_col_1) || is.na(id_col_2)) next
name_col_1 <- pick_col(df1, c("GeneName","Preferred_name","gene","Symbol","Name"))
name_col_2 <- pick_col(df2, c("GeneName","Preferred_name","gene","Symbol","Name"))
desc_col_1 <- pick_col(df1, c("Description","product","Product","annotation","Annot","note"))
desc_col_2 <- pick_col(df2, c("Description","product","Product","annotation","Annot","note"))
g1 <- unique(trimws(df1[[id_col_1]])); g1 <- g1[nzchar(g1)]
g2 <- unique(trimws(df2[[id_col_2]])); g2 <- g2[nzchar(g2)]
GOI_raw <- unique(c(g1, g2))
present <- intersect(mat_ids, GOI_raw)
if (!length(present)) {
present <- unique(mat_ids[strip_gene_prefix(mat_ids) %in% strip_gene_prefix(GOI_raw)])
}
if (!length(present)) next
getcol <- function(df, col, n) if (is.na(col)) rep("", n) else as.character(df[[col]])
plain1 <- strip_gene_prefix(as.character(df1[[id_col_1]]))
plain2 <- strip_gene_prefix(as.character(df2[[id_col_2]]))
nm1 <- getcol(df1, name_col_1, nrow(df1)); nm2 <- getcol(df2, name_col_2, nrow(df2))
ds1 <- getcol(df1, desc_col_1, nrow(df1)); ds2 <- getcol(df2, desc_col_2, nrow(df2))
nm1[is.na(nm1)] <- ""; nm2[is.na(nm2)] <- ""
ds1[is.na(ds1)] <- ""; ds2[is.na(ds2)] <- ""
keys_all <- unique(c(plain1, plain2))
id2name <- setNames(rep("", length(keys_all)), keys_all)
id2desc <- setNames(rep("", length(keys_all)), keys_all)
fill_map <- function(keys, vals, mp) {
for (i in seq_along(keys)) {
k <- keys[i]; v <- vals[i]
if (!nzchar(k)) next
if (!nzchar(mp[[k]]) && nzchar(v)) mp[[k]] <- v
}
mp
}
id2name <- fill_map(plain1, nm1, id2name); id2name <- fill_map(plain2, nm2, id2name)
id2desc <- fill_map(plain1, ds1, id2desc); id2desc <- fill_map(plain2, ds2, id2desc)
cond_tags <- sprintf(cond_order_template, tt)
keep_cols <- match_tags(colnames(expr_all), cond_tags)
if (!any(keep_cols)) next
sub_idx <- which(keep_cols)
sub_names <- colnames(expr_all)[sub_idx]
cond_for_col <- vapply(sub_names, detect_tag, character(1), tags = cond_tags)
cond_rank <- match(cond_for_col, cond_tags)
ord <- order(cond_rank, sub_names)
sub_idx <- sub_idx[ord]
expr_sub <- expr_all[present, sub_idx, drop = FALSE]
row_ok <- apply(expr_sub, 1, function(x) is.finite(sum(x)) && var(x, na.rm = TRUE) > 0)
datamat <- expr_sub[row_ok, , drop = FALSE]
if (nrow(datamat) < 2) next
hr <- hclust(as.dist(1 - cor(t(datamat), method = "pearson")), method = "complete")
mycl <- cutree(hr, h = max(hr$height) / 1.1)
palette_base <- c("yellow","blue","orange","magenta","cyan","red","green","maroon",
"lightblue","pink","purple","lightcyan","salmon","lightgreen")
mycol <- palette_base[(as.vector(mycl) - 1) %% length(palette_base) + 1]
labRow <- make_pretty_labels(rownames(datamat), id2name, id2desc)
labCol <- gsub("_", " ", colnames(datamat))
gene_id <- rownames(datamat)
gene_plain <- strip_gene_prefix(gene_id)
gene_name <- unname(id2name[gene_plain]); gene_name[is.na(gene_name)] <- ""
gene_desc <- unname(id2desc[gene_plain]); gene_desc[is.na(gene_desc)] <- ""
out_tbl <- data.frame(
GeneID = gene_id,
GeneID_plain = gene_plain,
GeneName = ifelse(nzchar(gene_name), gene_name, gene_plain),
Description = gene_desc,
datamat,
check.names = FALSE,
stringsAsFactors = FALSE
)
base <- sprintf("%s_%s_merged3", tt, tag)
out_xlsx <- file.path(out_dir, paste0("table_", base, ".xlsx"))
write.xlsx(out_tbl, out_xlsx, overwrite = TRUE)
out_png <- file.path(out_dir, paste0("heatmap_", base, ".png"))
cex_row <- if (nrow(datamat) > 600) 0.90 else if (nrow(datamat) > 300) 1.05 else 1.30
height <- max(1600, min(18000, 34 * nrow(datamat)))
png(out_png, width = 2200, height = height)
heatmap.2(
datamat,
Rowv = as.dendrogram(hr),
Colv = FALSE,
dendrogram = "row",
col = bluered(75),
scale = "row",
trace = "none",
density.info = "none",
RowSideColors = mycol,
margins = c(12, 60),
labRow = labRow,
labCol = labCol,
cexRow = cex_row,
cexCol = 2.0,
srtCol = 15,
key = FALSE
)
dev.off()
message("WROTE: ", out_png)
message("WROTE: ", out_xlsx)
}
}
message("Done. Output dir: ", out_dir)
Run it:
setwd("~/DATA/Data_JuliaFuchs_RNAseq_2025")
source("subset_heatmaps/draw_9_merged_heatmaps.R")
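The two numerical steps in the script, dropping constant or non-finite rows and clustering on 1 minus the Pearson correlation, are easy to misread in the flattened R code. The sketch below restates them in plain Python under the assumption that each gene is a list of expression values; it is an illustration of the idea, not a replacement for hclust.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def keep_row(values):
    """Approximate mirror of row_ok: all values finite and not constant."""
    if not all(math.isfinite(v) for v in values):
        return False
    return max(values) > min(values)


def correlation_distance(rows):
    """Pairwise 1 - Pearson r between rows (dict: gene -> list of values)."""
    genes = list(rows)
    return {
        (g1, g2): 1.0 - pearson(rows[g1], rows[g2])
        for i, g1 in enumerate(genes)
        for g2 in genes[i + 1:]
    }
```

Constant rows have to be removed first because their variance is zero and the correlation is undefined; that is why the R code filters with row_ok before calling cor().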
3.3 Optional: Plot 2-condition subset heatmaps (R)
#!/usr/bin/env Rscript
## =============================================================
## Draw 18 subset heatmaps using *_matched.tsv as input
## Output: subset_heatmaps/heatmaps_from_matched/
##
## Requirements:
## - rld or vsd exists in environment (DESeq2 transform)
## If running as Rscript, you must load/create rld/vsd BEFORE sourcing this file
## (see the note at the bottom for the "source()" way)
##
## Matched TSV must contain GeneID or GeneID_plain (or GeneName) columns.
## =============================================================
suppressPackageStartupMessages(library(gplots))
in_dir <- "subset_heatmaps"
out_dir <- file.path(in_dir, "heatmaps_from_matched")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
# -------------------------
# Helper functions
# -------------------------
pick_col <- function(df, candidates) {
hit <- intersect(candidates, names(df))
if (length(hit) == 0) return(NA_character_)
hit[1]
}
strip_gene_prefix <- function(x) sub("^gene[-_]", "", x)
split_contrast_groups <- function(x) {
parts <- strsplit(x, "_vs_", fixed = TRUE)[[1]]
if (length(parts) != 2L) stop("Contrast must be in form A_vs_B: ", x)
parts
}
match_tags <- function(nms, tags) {
pat <- paste0("(^|_)(?:", paste(tags, collapse = "|"), ")(_|$)")
grepl(pat, nms, perl = TRUE)
}
# -------------------------
# Get expression matrix
# -------------------------
if (exists("rld")) {
expr_all <- assay(rld)
} else if (exists("vsd")) {
expr_all <- assay(vsd)
} else {
stop("Neither 'rld' nor 'vsd' exists. Create/load it before running this script.")
}
expr_all <- as.matrix(expr_all)
mat_ids <- rownames(expr_all)
if (is.null(mat_ids)) stop("Expression matrix has no rownames.")
# -------------------------
# List your 18 matched inputs
# -------------------------
matched_files <- c(
"Mitomycin_4h_vs_Untreated_4h_A_phage_matched.tsv",
"Mitomycin_4h_vs_Untreated_4h_B_stress_matched.tsv",
"Mitomycin_4h_vs_Untreated_4h_C_biofilm_matched.tsv",
"Mitomycin_8h_vs_Untreated_8h_A_phage_matched.tsv",
"Mitomycin_8h_vs_Untreated_8h_B_stress_matched.tsv",
"Mitomycin_8h_vs_Untreated_8h_C_biofilm_matched.tsv",
"Mitomycin_18h_vs_Untreated_18h_A_phage_matched.tsv",
"Mitomycin_18h_vs_Untreated_18h_B_stress_matched.tsv",
"Mitomycin_18h_vs_Untreated_18h_C_biofilm_matched.tsv",
"Moxi_4h_vs_Untreated_4h_A_phage_matched.tsv",
"Moxi_4h_vs_Untreated_4h_B_stress_matched.tsv",
"Moxi_4h_vs_Untreated_4h_C_biofilm_matched.tsv",
"Moxi_8h_vs_Untreated_8h_A_phage_matched.tsv",
"Moxi_8h_vs_Untreated_8h_B_stress_matched.tsv",
"Moxi_8h_vs_Untreated_8h_C_biofilm_matched.tsv",
"Moxi_18h_vs_Untreated_18h_A_phage_matched.tsv",
"Moxi_18h_vs_Untreated_18h_B_stress_matched.tsv",
"Moxi_18h_vs_Untreated_18h_C_biofilm_matched.tsv"
)
matched_paths <- file.path(in_dir, matched_files)
# -------------------------
# Main loop
# -------------------------
for (path in matched_paths) {
if (!file.exists(path)) {
message("SKIP missing: ", path)
next
}
base <- sub("_matched\\.tsv$", "", basename(path))
# base looks like: Mitomycin_4h_vs_Untreated_4h_A_phage
# split base into contrast + tag (last 2 underscore fields are the tag)
parts <- strsplit(base, "_")[[1]]
if (length(parts) < 6) {
message("SKIP unexpected name: ", base)
next
}
# infer tag as last 2 parts: e.g. A_phage / B_stress / C_biofilm
tag <- paste0(parts[length(parts)-1], "_", parts[length(parts)])
# contrast is the rest
contrast <- paste(parts[1:(length(parts)-2)], collapse = "_")
# read matched TSV
df <- read.delim(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE, check.names = FALSE)
id_col <- pick_col(df, c("GeneID", "GeneID_plain", "GeneName", "Gene_Id", "gene_id", "locus_tag", "LocusTag", "ID"))
if (is.na(id_col)) {
message("SKIP (no ID col): ", path)
next
}
GOI_raw <- unique(trimws(df[[id_col]]))
GOI_raw <- GOI_raw[nzchar(GOI_raw)]
# match GOI to matrix ids robustly
present <- intersect(mat_ids, GOI_raw)
if (!length(present)) {
present <- unique(mat_ids[strip_gene_prefix(mat_ids) %in% strip_gene_prefix(GOI_raw)])
}
if (!length(present)) {
message("SKIP (no GOI matched matrix): ", base)
next
}
# subset columns for the two groups
groups <- split_contrast_groups(contrast)
keep_cols <- match_tags(colnames(expr_all), groups)
if (!any(keep_cols)) {
message("SKIP (no columns matched groups): ", contrast)
next
}
cols_idx <- which(keep_cols)
sub_colnames <- colnames(expr_all)[cols_idx]
# put Untreated first (2nd group in "Treated_vs_Untreated")
ord <- order(!grepl(paste0("(^|_)", groups[2], "(_|$)"), sub_colnames, perl = TRUE))
cols_idx <- cols_idx[ord]
expr_sub <- expr_all[present, cols_idx, drop = FALSE]
# remove constant/NA rows
row_ok <- apply(expr_sub, 1, function(x) is.finite(sum(x)) && var(x, na.rm = TRUE) > 0)
datamat <- expr_sub[row_ok, , drop = FALSE]
if (nrow(datamat) < 2) {
message("SKIP (too few rows after filtering): ", base)
next
}
# clustering
hr <- hclust(as.dist(1 - cor(t(datamat), method = "pearson")), method = "complete")
mycl <- cutree(hr, h = max(hr$height) / 1.1)
palette_base <- c("yellow","blue","orange","magenta","cyan","red","green","maroon",
"lightblue","pink","purple","lightcyan","salmon","lightgreen")
mycol <- palette_base[(as.vector(mycl) - 1) %% length(palette_base) + 1]
# labels
labRow <- rownames(datamat)
labRow <- sub("^gene-", "", labRow)
labRow <- sub("^rna-", "", labRow)
labCol <- colnames(datamat)
labCol <- gsub("_", " ", labCol)
# output sizes
height <- max(900, min(12000, 25 * nrow(datamat)))
out_png <- file.path(out_dir, paste0("heatmap_", base, ".png"))
out_mat <- file.path(out_dir, paste0("matrix_", base, ".csv"))
write.csv(as.data.frame(datamat), out_mat, quote = FALSE)
png(out_png, width = 1100, height = height)
heatmap.2(
datamat,
Rowv = as.dendrogram(hr),
Colv = FALSE,
dendrogram = "row",
col = bluered(75),
scale = "row",
trace = "none",
density.info = "none",
RowSideColors = mycol,
margins = c(10, 15),
sepwidth = c(0, 0),
labRow = labRow,
labCol = labCol,
cexRow = if (nrow(datamat) > 500) 0.6 else 1.0,
cexCol = 1.7,
srtCol = 15,
lhei = c(0.01, 4),
lwid = c(0.5, 4),
key = FALSE
)
dev.off()
message("WROTE: ", out_png)
}
message("All done. Output dir: ", out_dir)
Run it:
setwd("~/DATA/Data_JuliaFuchs_RNAseq_2025")
source("subset_heatmaps/draw_18_heatmaps_from_matched.R")
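The script derives the contrast, the two sample groups and the gene-set tag purely from the file name. A small Python sketch of that parsing logic is given here for reference, assuming names of the form Treated_time_vs_Untreated_time_X_tag_matched.tsv as listed in the script; parse_matched_name is a hypothetical name.

```python
def parse_matched_name(filename):
    """Split a *_matched.tsv name into contrast, groups and tag."""
    suffix = "_matched.tsv"
    base = filename[: -len(suffix)] if filename.endswith(suffix) else filename
    parts = base.split("_")
    if len(parts) < 6:
        raise ValueError(f"Unexpected name: {filename}")
    tag = "_".join(parts[-2:])        # last two fields, e.g. A_phage
    contrast = "_".join(parts[:-2])   # everything before the tag
    groups = contrast.split("_vs_")
    if len(groups) != 2:
        raise ValueError(f"Contrast must be in form A_vs_B: {contrast}")
    treated, control = groups
    return {"contrast": contrast, "treated": treated, "control": control, "tag": tag}
```

Note that this only works because every tag consists of exactly two underscore-separated fields; a tag such as D_cell_wall would need a different split rule.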
4) Optional: Update README_Heatmap to support “GOI file OR no-cutoff”
If you still use the older README_Heatmap logic that expects *-up.id and *-down.id, replace the GOI-building block with this (single GOI list or whole CSV with no cutoff):
geneset_file <- NA_character_ # e.g. "subset_heatmaps/GOI_Mitomycin_4h_vs_Untreated_4h_A_phage_GeneID.id"
use_all_genes_no_cutoff <- FALSE
if (!is.na(geneset_file) && file.exists(geneset_file)) {
GOI <- read_ids_from_file(geneset_file)
} else if (isTRUE(use_all_genes_no_cutoff)) {
all_path <- file.path("./results/star_salmon/degenes", paste0(contrast, "-all_annotated.csv"))
ann <- read.csv(all_path, stringsAsFactors = FALSE, check.names = FALSE)
id_col <- if ("GeneID" %in% names(ann)) "GeneID" else if ("GeneID_plain" %in% names(ann)) "GeneID_plain" else NA_character_
if (is.na(id_col)) stop("No GeneID / GeneID_plain in: ", all_path)
GOI <- unique(trimws(gsub('"', "", ann[[id_col]])))
GOI <- GOI[nzchar(GOI)]
} else {
stop("Set geneset_file OR set use_all_genes_no_cutoff <- TRUE")
}
present <- intersect(rownames(RNASeq.NoCellLine), GOI)
if (!length(present)) stop("None of the GOI found in expression matrix rownames.")
GOI <- present
5) Script inventory (bash + python)
Bash
subset_heatmaps/run_subset_setup.sh
Python
- subset_heatmaps/extract_cds_fasta.py
- subset_heatmaps/blast_hits_to_geneset.py
- subset_heatmaps/geneset_by_keywords.py
- subset_heatmaps/make_goi_lists_batch.py
R
- subset_heatmaps/draw_9_merged_heatmaps.R
- subset_heatmaps/draw_18_heatmaps_from_matched.R
This post doubles as a lab wiki entry, GitHub README and methods note. Every script referenced is included in full above and can be copied into subset_heatmaps/ directly.
Bacterial WGS Pipeline (Isolate Genomes, Data_Tam_DNAseq_2026_Acinetobacter_harbinensis): nf-core/bacass → Assembly/QC → Annotation → AMR/Virulence → Core-Genome Phylogeny → ANI

This post is a standalone, reproducible record of the bacterial WGS pipeline I used (example sample: AN6). I’m keeping all command lines (as-run) so you can reuse the workflow for future projects. Wherever you see absolute paths, replace them with your own.
0) Prerequisites (what you need installed)
- Nextflow
- Docker (for nf-core/bacass -profile docker)
- Conda/Mamba
- CLI tools used later: fastqc, spades.py, shovill, pigz, awk, seqkit, fastANI, plus R (for plotting) and the tools required by the provided scripts.
1) Download KmerFinder database
# Download the kmerfinder database: https://www.genomicepidemiology.org/services/ --> https://cge.food.dtu.dk/services/KmerFinder/ --> https://cge.food.dtu.dk/services/KmerFinder/etc/kmerfinder_db.tar.gz
# Download 20190108_kmerfinder_stable_dirs.tar.gz from https://zenodo.org/records/13447056
2) Run nf-core/bacass (Nextflow)
#--kmerfinderdb /path/to/kmerfinder/bacteria.tar.gz
#--kmerfinderdb /mnt/nvme1n1p1/REFs/kmerfinder_db.tar.gz
#--kmerfinderdb /mnt/nvme1n1p1/REFs/20190108_kmerfinder_stable_dirs.tar.gz
nextflow run nf-core/bacass -r 2.5.0 -profile docker \
--input samplesheet.tsv \
--outdir bacass_out \
--assembly_type long \
--kraken2db /mnt/nvme1n1p1/REFs/k2_standard_08_GB_20251015.tar.gz \
--kmerfinderdb /mnt/nvme1n1p1/REFs/kmerfinder/bacteria/ \
-resume
#SAVE bacass_out/Kmerfinder/kmerfinder_summary.csv to bacass_out/Kmerfinder/An6/An6_kmerfinder_results.xlsx
3) Assembly (AN6 example)
3.1 Link raw reads + run FastQC
ln -s ../X101SC25116512-Z01-J002/01.RawData/An6/An6_1.fq.gz An6_R1.fastq.gz
ln -s ../X101SC25116512-Z01-J002/01.RawData/An6/An6_2.fq.gz An6_R2.fastq.gz
mkdir fastqc_out
fastqc -t 4 raw_data/* -o fastqc_out/
mamba activate /home/jhuang/miniconda3/envs/bengal3_ac3
3.2 Trimming decision notes (kept as recorded)
For the AN6 data, running Trimmomatic first is in most cases not worthwhile: the adapters are fine, and the per-tile failures are instrument/tile related and are not fixed by trimming.
* **Adapters:** FastQC shows **Adapter Content = PASS** for both R1/R2.
* **Overrepresented sequences:** none detected.
* **Per-tile sequence quality:** **FAIL** (this is usually an instrument/tile effect; trimming adapters won't fix it).
Shovill: pre-trimming is not required; Shovill's own adaptor trimming is off by default and can be switched on with --trim.
SPAdes: trimming is optional; try the raw reads first, then trimmed reads if needed.
3.3 If you do need Trimmomatic (command kept)
# Paired-end trimming with Trimmomatic (Illumina-style)
# Adjust TRIMMOMATIC_JAR and ADAPTERS paths to your install.
TRIMMOMATIC_JAR=/path/to/trimmomatic.jar
ADAPTERS=/path/to/Trimmomatic/adapters/TruSeq3-PE.fa
java -jar "$TRIMMOMATIC_JAR" PE -threads 16 -phred33 \
An6_R1.fastq.gz An6_R2.fastq.gz \
An6_R1.trim.paired.fastq.gz An6_R1.trim.unpaired.fastq.gz \
An6_R2.trim.paired.fastq.gz An6_R2.trim.unpaired.fastq.gz \
ILLUMINACLIP:"$ADAPTERS":2:30:10 \
LEADING:3 TRAILING:3 \
SLIDINGWINDOW:4:20 \
MINLEN:50
What you feed into SPAdes/Shovill afterward:
- Use the paired outputs: An6_R1.trim.paired.fastq.gz and An6_R2.trim.paired.fastq.gz
- Optional: you can include unpaired reads in SPAdes, but many people skip them for isolate assemblies unless coverage is low.
SPAdes on the raw reads, once in --isolate mode and once in --careful mode:
spades.py \
-1 raw_data/An6_R1.fastq.gz \
-2 raw_data/An6_R2.fastq.gz \
--isolate \
-t 32 -m 250 \
-o spades_out
spades.py \
-1 raw_data/An6_R1.fastq.gz \
-2 raw_data/An6_R2.fastq.gz \
--careful \
-t 32 -m 250 \
-o spades_out_careful
Shovill (CHOSEN; adaptor trimming is off by default and is only done if you pass --trim):
shovill \
--R1 raw_data/An6_R1.fastq.gz \
--R2 raw_data/An6_R2.fastq.gz \
--outdir shovill_out \
--cpus 32 --ram 250 \
--depth 100
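Shovill leaves the reads untrimmed unless --trim is given, so the command above already assembles the raw reads. Read error correction is still applied; add --noreadcorr to disable it.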
4) Genome annotation — BV-BRC ComprehensiveGenomeAnalysis
* Use: https://www.bv-brc.org/app/ComprehensiveGenomeAnalysis
* Input: scaffolded results from bacass
* Purpose: comprehensive overview + annotation of the genome assembly.
5) Table 1 — summary of sequence data + genome features (env: gunc_env)
5.1 Environment prep + pipeline run (kept)
# Prepare environment and run the Table 1 (Summary of sequence data and genome features (env: gunc_env)) pipeline:
# activate the env that has openpyxl
mamba activate gunc_env
mamba install -n gunc_env -c conda-forge openpyxl -y
mamba deactivate
# STEP_1
ENV_NAME=gunc_env \
SAMPLE=AN6 \
ASM=shovill_out/contigs.fa \
R1=./X101SC25116512-Z01-J002/01.RawData/An6/An6_1.fq.gz \
R2=./X101SC25116512-Z01-J002/01.RawData/An6/An6_2.fq.gz \
./make_table1_pe.sh
# STEP_2
python export_table1_stats_to_excel_py36_compat.py \
--workdir table1_AN6_work \
--out Comprehensive_AN6.xlsx \
--max-rows 200000 \
--sample AN6
5.2 Manual calculations (kept)
#Computed manually for the items “Total number of reads sequenced” and “Mean read length (bp)”:
#Total number of reads sequenced 9,127,297 × 2
#Coverage depth (sequencing depth) 589.4×
pigz -dc X101SC25116512-Z01-J002/01.RawData/An6/An6_1.fq.gz | awk 'END{print NR/4}'
seqkit stats X101SC25116512-Z01-J002/01.RawData/An6/An6_1.fq.gz
#file format type num_seqs sum_len min_len avg_len max_len
#X101SC25116512-Z01-J002/01.RawData/An6/An6_1.fq.gz FASTQ DNA 15,929,405 2,389,410,750 150 150 150
5.3 Example metrics table snapshot (kept)
| Metric | Value |
| --- | --- |
| Genome size (bp) | 3,012,410 |
| Contig count (>= 500 bp) | 41 |
| Total number of reads sequenced | 15,929,405 × 2 |
| Coverage depth (sequencing depth) | 1454.3× |
| Coarse consistency (%) | 99.67 |
| Fine consistency (%) | 94.50 |
| Completeness (%) | 99.73 |
| Contamination (%) | 0.21 |
| Contigs N50 (bp) | 169,757 |
| Contigs L50 | 4 |
| Guanine-cytosine content (%) | 41.14 |
| Number of coding sequences (CDSs) | 2,938 |
| Number of tRNAs | 69 |
| Number of rRNAs | 3 |
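As a plausibility check on the coverage row, the raw sequencing yield gives an upper bound on depth: read pairs × 2 × read length / genome size. The sketch below uses the numbers from the table and the seqkit output (150 bp reads). That the reported 1454.3× is lower is assumed here to be because it is a mapping-based depth, which only counts aligned bases; this is an interpretation, not something the pipeline output states.

```python
def theoretical_coverage(read_pairs, read_len_bp, genome_size_bp):
    """Upper-bound depth from raw yield: all bases divided by genome size."""
    total_bases = read_pairs * 2 * read_len_bp
    return total_bases / genome_size_bp


# Values taken from the table and the seqkit stats line above.
cov = theoretical_coverage(read_pairs=15_929_405, read_len_bp=150, genome_size_bp=3_012_410)
print(f"Theoretical coverage from raw yield: {cov:.1f}x")
```

If the mapping-based value were higher than this bound, that would point to a mix-up of samples or read files.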
6) AMR / virulence screening (ABRicate workflows)
cp shovill_out/contigs.fa AN6.fasta
ENV_NAME=/home/jhuang/miniconda3/envs/bengal3_ac3 ASM=AN6.fasta SAMPLE=AN6 THREADS=32 ./run_resistome_virulome_dedup.sh #Default MINID=90 MINCOV=60
ENV_NAME=/home/jhuang/miniconda3/envs/bengal3_ac3 ASM=AN6.fasta SAMPLE=AN6 MINID=80 MINCOV=60 ./run_resistome_virulome_dedup.sh # 0 0 0 0
ENV_NAME=/home/jhuang/miniconda3/envs/bengal3_ac3 ASM=AN6.fasta SAMPLE=AN6 MINID=70 MINCOV=50 ./run_resistome_virulome_dedup.sh # 5 5 0 4
#Sanity checks on ABRicate outputs
grep -vc '^#' resistome_virulence_AN6/raw/AN6.megares.tab
grep -vc '^#' resistome_virulence_AN6/raw/AN6.card.tab
grep -vc '^#' resistome_virulence_AN6/raw/AN6.resfinder.tab
grep -vc '^#' resistome_virulence_AN6/raw/AN6.vfdb.tab
#!!!!!! DEBUG_TOMORROW: why using 'MINID=70 MINCOV=50' didn't return the 5504?
#Dedup tables / “one per gene” mode
rm Resistome_Virulence_An6.xlsx
chmod +x run_abricate_resistome_virulome_one_per_gene.sh
ENV_NAME=/home/jhuang/miniconda3/envs/bengal3_ac3 \
ASM=AN6.fasta \
SAMPLE=AN6 \
OUTDIR=resistome_virulence_AN6 \
MINID=70 MINCOV=50 \
THREADS=32 \
./run_abricate_resistome_virulome_one_per_gene.sh
cd resistome_virulence_AN6
python3 -c 'import pandas as pd; from pathlib import Path; files=["Table_AMR_genes_dedup.tsv","Table_AMR_genes_one_per_gene.tsv","Table_Virulence_VFDB_dedup.tsv","Table_DB_hit_counts.tsv"]; out="AN6_resistome_virulence.xlsx"; w=pd.ExcelWriter(out, engine="openpyxl"); [pd.read_csv(f, sep="\t").to_excel(w, sheet_name=Path(f).stem[:31], index=False) for f in files]; w.close(); print(out)'
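The grep -vc '^#' sanity checks above can also be scripted so that all four databases are counted in one call. This is a minimal sketch using only the standard library, assuming the raw ABRicate tables live under resistome_virulence_AN6/raw/ with the names used above; count_hits and hit_counts are names invented for this example.

```python
from pathlib import Path


def count_hits(tab_path):
    """Count lines that do not start with '#', like grep -vc '^#'."""
    with open(tab_path, encoding="utf-8") as fh:
        return sum(1 for line in fh if not line.startswith("#"))


def hit_counts(raw_dir, sample, dbs=("megares", "card", "resfinder", "vfdb")):
    """Return {db: hit count}, or None for a database whose table is missing."""
    counts = {}
    for db in dbs:
        path = Path(raw_dir) / f"{sample}.{db}.tab"
        counts[db] = count_hits(path) if path.is_file() else None
    return counts
```

For example, hit_counts("resistome_virulence_AN6/raw", "AN6") returns one entry per database, which makes it easier to compare the MINID/MINCOV settings side by side.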
7) Core-genome phylogeny (NCBI + Roary + RAxML-NG + R plotting)
#Generate targets.tsv from ./bvbrc_out/Acinetobacter_harbinensis_AN6/FullGenomeReport.html.
export NCBI_EMAIL="xxx@yyy.de"
./resolve_best_assemblies_entrez.py targets.tsv resolved_accessions.tsv
#[OK] Acinetobacter_harbinensis_HITLi7 -> GCF_000816495.1 (Scaffold)
#[OK] Acinetobacter_sp._ANC -> GCF_965200015.1 (Complete Genome)
#[OK] Acinetobacter_sp._TTH0-4 -> GCF_965200015.1 (Complete Genome)
#[OK] Acinetobacter_tandoii_DSM_14970 -> GCF_000621065.1 (Scaffold)
#[OK] Acinetobacter_towneri_DSM_14962 -> GCF_000368785.1 (Scaffold)
#[OK] Acinetobacter_radioresistens_SH164 -> GCF_000162115.1 (Scaffold)
#[OK] Acinetobacter_radioresistens_SK82 -> GCF_000175675.1 (Contig)
#[OK] Acinetobacter_radioresistens_DSM_6976 -> GCF_000368905.1 (Scaffold)
#[OK] Acinetobacter_indicus_ANC -> GCF_000413875.1 (Scaffold)
#[OK] Acinetobacter_indicus_CIP_110367 -> GCF_000488255.1 (Scaffold)
#NOTE: the env bengal3_ac3 lacks the R packages needed for the plot step (see the ggtree/aplot error below), so r_env is used for plotting → RUN TWICE: first in bengal3_ac3, then run build_wgs_tree_fig3B.sh plot-only.
#ADAPT the parameters EXTRA_ASSEMBLIES (may stay empty) and REF_FASTA (here AN6.fasta)
conda activate /home/jhuang/miniconda3/envs/bengal3_ac3
export NCBI_EMAIL="xxx@yyy.de"
ENV_NAME=/home/jhuang/miniconda3/envs/bengal3_ac3 ./build_wgs_tree_fig3B.sh
# (Optional) to drop some leaves from the tree, remove them from the inputs so Roary cannot include them
for id in "GCF_002291425.1" "GCF_047901425.1" "GCF_004342245.1" "GCA_032062225.1"; do
rm -f work_wgs_tree/gffs/${id}.gff
rm -f work_wgs_tree/fastas/${id}.fna
rm -rf work_wgs_tree/prokka/${id}
rm -rf work_wgs_tree/genomes_ncbi/${id}
# remove from accession list so it won't come back
# pass the shell variable with -v; "${id}" is not expanded inside single quotes
awk -F'\t' -v id="${id}" 'NR==1 || $2!=id' work_wgs_tree/meta/accessions.tsv > work_wgs_tree/meta/accessions.tsv.tmp \
&& mv work_wgs_tree/meta/accessions.tsv.tmp work_wgs_tree/meta/accessions.tsv
done
./build_wgs_tree_fig3B.sh
#Wrote: work_wgs_tree/plot/labels.tsv
#Error: package or namespace load failed for ‘ggtree’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
#there is no package called ‘aplot’
#Execution halted --> Using env r_env instead (see below)!
# Run this to regenerate labels.tsv
bash regenerate_labels.sh
# Regenerate the plot --> ERROR --> Using Rscript instead (see below)!
ENV_NAME=/home/jhuang/mambaforge/envs/r_env ./build_wgs_tree_fig3B.sh plot-only
#-->Error in as.hclust.phylo(tr) : the tree is not ultrametric
# 8) Manually correct the display names in work_wgs_tree/plot/labels.tsv
#sample display
#GCF_000816495.1 Acinetobacter harbinensis HITLi7 (GCF_000816495.1)
#GCF_965200015.1 Acinetobacter sp. ANC (GCF_965200015.1)
#GCF_000621065.1 Acinetobacter tandoii DSM 14970 (GCF_000621065.1)
#GCF_000368785.1 Acinetobacter towneri DSM 14962 (GCF_000368785.1)
#GCF_000162115.1 Acinetobacter radioresistens SH164 (GCF_000162115.1)
#GCF_000175675.1 Acinetobacter radioresistens SK82 (GCF_000175675.1)
#GCF_000368905.1 Acinetobacter radioresistens DSM 6976 (GCF_000368905.1)
#GCF_000413875.1 Acinetobacter indicus ANC (GCF_000413875.1)
#GCF_000488255.1 Acinetobacter indicus CIP 110367 (GCF_000488255.1)
#REF AN6
# 9) Rerun only the plot step using plot_tree_v4.R
Rscript ./plot_tree_v4.R \
work_wgs_tree/raxmlng/core.raxml.support \
work_wgs_tree/plot/labels.tsv \
6 \
work_wgs_tree/plot/core_tree.pdf \
work_wgs_tree/plot/core_tree.png
8) ANI confirmation (fastANI loop)
mamba activate /home/jhuang/miniconda3/envs/bengal3_ac3
for id in GCF_000621065.1.fna GCF_000368785.1.fna GCF_000175675.1.fna GCF_000368905.1.fna GCF_000816495.1.fna GCF_965200015.1.fna GCF_000488255.1.fna GCF_000413875.1.fna GCF_000162115.1.fna; do
fastANI -q AN6.fasta -r ./work_wgs_tree/fastas/${id} -o fastANI_AN6_vs_${id}.txt
done
# Alternatively, we can use the script run_fastani_batch_verbose.sh.
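Each fastANI run in the loop writes one tab-separated line (query, reference, ANI, matched fragments, total query fragments), so the per-reference files are easy to collect into a single ranked list. The sketch below assumes the fastANI_AN6_vs_*.txt naming from the loop; the 95% cutoff is the commonly used species-level convention and is an assumption here, not something set by this pipeline.

```python
import glob


def parse_fastani_line(line):
    """Parse one fastANI output line into a dict."""
    query, ref, ani, matched, total = line.rstrip("\n").split("\t")[:5]
    return {
        "query": query,
        "reference": ref,
        "ani": float(ani),
        "matched": int(matched),
        "total": int(total),
    }


def summarize(lines, species_cutoff=95.0):
    """Sort hits by ANI (highest first) and flag those at or above the cutoff."""
    rows = [parse_fastani_line(l) for l in lines if l.strip()]
    rows.sort(key=lambda r: r["ani"], reverse=True)
    for r in rows:
        r["same_species"] = r["ani"] >= species_cutoff
    return rows


def read_all(pattern="fastANI_AN6_vs_*.txt"):
    """Collect the lines of every fastANI result file matching the pattern."""
    lines = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as fh:
            lines.extend(fh.readlines())
    return lines
```

Running summarize(read_all()) in the directory holding the result files gives the closest reference first. A reference with no line in its output file produced no reportable ANI, which fastANI does for distant genome pairs.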
9) Contig-to-reference mapping (how many contigs map?)
In total, we obtained 41 contigs >500 nt. Of these, 36 contigs were scaffolded with Multi-CSAR v1.1 into three chromosomal scaffolds:
- SCF_1: 1,773,912 bp
- SCF_2: 1,197,749 bp
- SCF_3: 23,925 bp
Total: 2,995,586 bp
The remaining five contigs (contig00026/32/33/37/39) could not be scaffolded. Their partial BLASTn matches to both plasmid and chromosomal sequences suggest shared mobile elements, but do not confirm circular plasmids. A sequence/assembly summary was exported to Excel (Summary_AN6.xlsx), including read yield/read-length statistics and key assembly/QC metrics (genome size, contigs/scaffolds, N50, GC%, completeness, contamination).
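The scaffold total can be re-derived from the three lengths, and comparing it with the genome size from the metrics table gives a rough idea of how much sequence sits in the five unscaffolded contigs. This sketch assumes that the 3,012,410 bp genome size refers to the same 41 contigs and ignores any gap characters introduced by scaffolding, so the remainder is an estimate only.

```python
scaffolds = {"SCF_1": 1_773_912, "SCF_2": 1_197_749, "SCF_3": 23_925}
genome_size_bp = 3_012_410  # "Genome size (bp)" from the metrics table

scaffold_total = sum(scaffolds.values())
unscaffolded_estimate = genome_size_bp - scaffold_total
scaffolded_fraction = scaffold_total / genome_size_bp

print(f"Scaffold total: {scaffold_total:,} bp")
print(f"Estimated unscaffolded: {unscaffolded_estimate:,} bp")
print(f"Scaffolded fraction: {scaffolded_fraction:.2%}")
```

Summing the actual lengths of contig00026/32/33/37/39 from the assembly FASTA would be the direct way to confirm the remainder.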
Complete scripts (as attached)
Below are the full scripts exactly as provided, including plot_tree_v4.R.
make_table1_pe.sh
#!/usr/bin/env bash
set -Eeuo pipefail
# =========================
# User config
ENV_NAME="${ENV_NAME:-checkm_env2}"
# If you have Illumina paired-end, set R1/R2 (recommended)
R1="${R1:-}"
R2="${R2:-}"
# If you have single-end/ONT-like reads, set READS instead (legacy mode)
READS="${READS:-}"
ASM="${ASM:-shovill_out/contigs.fa}"
SAMPLE="${SAMPLE:-An6}"
THREADS="${THREADS:-32}"
OUT_TSV="${OUT_TSV:-Table1_${SAMPLE}.tsv}"
WORKDIR="${WORKDIR:-table1_${SAMPLE}_work}"
LOGDIR="${LOGDIR:-${WORKDIR}/logs}"
LOGFILE="${LOGFILE:-${LOGDIR}/run_$(date +%F_%H%M%S).log}"
AUTO_INSTALL="${AUTO_INSTALL:-1}" # 1=install missing tools in ENV_NAME
GUNC_DB_KIND="${GUNC_DB_KIND:-progenomes}" # progenomes or gtdb
# =========================
mkdir -p "${LOGDIR}"
exec > >(tee -a "${LOGFILE}") 2>&1
ts(){ date +"%F %T"; }
log(){ echo "[$(ts)] $*"; }
on_err() {
local ec=$?
log "ERROR: failed (exit=${ec}) at line ${BASH_LINENO[0]}: ${BASH_COMMAND}"
log "Logfile: ${LOGFILE}"
exit "${ec}"
}
trap on_err ERR
# print every command
set -x
need_cmd(){ command -v "$1" >/dev/null 2>&1; }
pick_pm() {
if need_cmd mamba; then echo "mamba"
elif need_cmd conda; then echo "conda"
else
log "ERROR: neither mamba nor conda found in PATH"
exit 1
fi
}
activate_env() {
if ! need_cmd conda; then
log "ERROR: conda not found; cannot activate env"
exit 1
fi
# shellcheck disable=SC1091
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate "${ENV_NAME}"
}
ensure_env_exists() {
# shellcheck disable=SC1091
source "$(conda info --base)/etc/profile.d/conda.sh"
if ! conda env list | awk '{print $1}' | grep -qx "${ENV_NAME}"; then
log "ERROR: env ${ENV_NAME} not found. Create it first."
exit 1
fi
}
install_pkgs_in_env() {
local pm="$1"; shift
local pkgs=("$@")
log "Installing into env ${ENV_NAME}: ${pkgs[*]}"
"${pm}" install -n "${ENV_NAME}" -c bioconda -c conda-forge -y "${pkgs[@]}"
}
pick_quast_cmd() {
if need_cmd quast; then echo "quast"
elif need_cmd quast.py; then echo "quast.py"
else echo ""
fi
}
# tool->package mapping (install missing ones)
declare -A TOOL2PKG=(
[quast]="quast"
[minimap2]="minimap2"
[samtools]="samtools"
[mosdepth]="mosdepth"
[checkm]="checkm-genome=1.1.3"
[gunc]="gunc"
[python]="python"
)
# =========================
# Detect mode (PE vs single)
MODE=""
if [[ -n "${R1}" || -n "${R2}" ]]; then
[[ -n "${R1}" && -n "${R2}" ]] || { log "ERROR: Provide both R1 and R2."; exit 1; }
MODE="PE"
elif [[ -n "${READS}" ]]; then
MODE="SINGLE"
else
log "ERROR: Provide either (R1+R2) OR READS."
exit 1
fi
# =========================
# Start
log "Start: Table 1 generation (reuse env=${ENV_NAME})"
log "Assembly: ${ASM}"
log "Sample: ${SAMPLE}"
log "Threads: ${THREADS}"
log "Workdir: ${WORKDIR}"
log "Logfile: ${LOGFILE}"
log "Mode: ${MODE}"
if [[ "${MODE}" == "PE" ]]; then
log "R1: ${R1}"
log "R2: ${R2}"
else
log "Reads: ${READS}"
fi
PM="$(pick_pm)"
log "Pkg manager: ${PM}"
ensure_env_exists
activate_env
log "Active envs:"
conda info --envs
log "Versions (if available):"
( python --version || true )
( checkm --version || true )
( gunc -v || true )
( minimap2 --version 2>&1 | head -n 2 || true )
( samtools --version 2>&1 | head -n 2 || true )
( mosdepth --version 2>&1 | head -n 2 || true )
( quast --version 2>&1 | head -n 2 || true )
( quast.py --version 2>&1 | head -n 2 || true )
# =========================
# Check/install missing tools in this env
MISSING_PKGS=()
for tool in minimap2 samtools mosdepth checkm gunc python; do
if ! need_cmd "${tool}"; then
MISSING_PKGS+=("${TOOL2PKG[$tool]}")
fi
done
QUAST_CMD="$(pick_quast_cmd)"
if [[ -z "${QUAST_CMD}" ]]; then
MISSING_PKGS+=("${TOOL2PKG[quast]}")
fi
if [[ "${#MISSING_PKGS[@]}" -gt 0 ]]; then
if [[ "${AUTO_INSTALL}" != "1" ]]; then
log "ERROR: missing tools and AUTO_INSTALL=0. Missing packages: ${MISSING_PKGS[*]}"
exit 1
fi
mapfile -t UNIQUE < <(printf "%s\n" "${MISSING_PKGS[@]}" | awk '!seen[$0]++')
install_pkgs_in_env "${PM}" "${UNIQUE[@]}"
activate_env
QUAST_CMD="$(pick_quast_cmd)"
fi
for tool in minimap2 samtools mosdepth checkm gunc python; do
need_cmd "${tool}" || { log "ERROR: still missing tool: ${tool}"; exit 1; }
done
[[ -n "${QUAST_CMD}" ]] || { log "ERROR: QUAST still missing."; exit 1; }
log "All tools ready. QUAST cmd: ${QUAST_CMD}"
# =========================
# Prepare workdir
mkdir -p "${WORKDIR}"/{genomes,reads,stats,quast,map,checkm,gunc,tmp}
ASM_ABS="$(realpath "${ASM}")"
ln -sf "${ASM_ABS}" "${WORKDIR}/genomes/${SAMPLE}.fasta"
if [[ "${MODE}" == "PE" ]]; then
R1_ABS="$(realpath "${R1}")"
R2_ABS="$(realpath "${R2}")"
ln -sf "${R1_ABS}" "${WORKDIR}/reads/${SAMPLE}.R1.fastq.gz"
ln -sf "${R2_ABS}" "${WORKDIR}/reads/${SAMPLE}.R2.fastq.gz"
else
READS_ABS="$(realpath "${READS}")"
ln -sf "${READS_ABS}" "${WORKDIR}/reads/${SAMPLE}.reads.fastq.gz"
fi
# =========================
# 1) QUAST
log "Run QUAST..."
"${QUAST_CMD}" "${WORKDIR}/genomes/${SAMPLE}.fasta" -o "${WORKDIR}/quast"
QUAST_TSV="${WORKDIR}/quast/report.tsv"
test -s "${QUAST_TSV}"
# =========================
# 2) Map reads + mosdepth
log "Map reads (minimap2) + sort BAM..."
SORT_T="$((THREADS>16?16:THREADS))"
if [[ "${MODE}" == "PE" ]]; then
minimap2 -t "${THREADS}" -ax sr \
"${WORKDIR}/genomes/${SAMPLE}.fasta" \
"${WORKDIR}/reads/${SAMPLE}.R1.fastq.gz" "${WORKDIR}/reads/${SAMPLE}.R2.fastq.gz" \
| samtools sort -@ "${SORT_T}" -o "${WORKDIR}/map/${SAMPLE}.bam" -
else
# legacy single-read mode; keep map-ont as in original script
minimap2 -t "${THREADS}" -ax map-ont \
"${WORKDIR}/genomes/${SAMPLE}.fasta" "${WORKDIR}/reads/${SAMPLE}.reads.fastq.gz" \
| samtools sort -@ "${SORT_T}" -o "${WORKDIR}/map/${SAMPLE}.bam" -
fi
samtools index "${WORKDIR}/map/${SAMPLE}.bam"
log "Compute depth (mosdepth)..."
mosdepth -t "${SORT_T}" "${WORKDIR}/map/${SAMPLE}" "${WORKDIR}/map/${SAMPLE}.bam"
MOS_SUMMARY="${WORKDIR}/map/${SAMPLE}.mosdepth.summary.txt"
test -s "${MOS_SUMMARY}"
# =========================
# 3) CheckM
log "Run CheckM lineage_wf..."
checkm lineage_wf -x fasta -t "${THREADS}" "${WORKDIR}/genomes" "${WORKDIR}/checkm/out"
log "Run CheckM qa..."
checkm qa "${WORKDIR}/checkm/out/lineage.ms" "${WORKDIR}/checkm/out" --tab_table -o 2 \
> "${WORKDIR}/checkm/checkm_summary.tsv"
CHECKM_SUM="${WORKDIR}/checkm/checkm_summary.tsv"
test -s "${CHECKM_SUM}"
# =========================
# 4) GUNC
log "Run GUNC..."
mkdir -p "${WORKDIR}/gunc/db" "${WORKDIR}/gunc/out"
if [[ -z "$(ls -A "${WORKDIR}/gunc/db" 2>/dev/null || true)" ]]; then
log "Downloading GUNC DB kind=${GUNC_DB_KIND} to ${WORKDIR}/gunc/db ..."
gunc download_db -db "${GUNC_DB_KIND}" "${WORKDIR}/gunc/db"
fi
DMND="$(find "${WORKDIR}/gunc/db" -type f -name "*.dmnd" | head -n 1 || true)"
if [[ -z "${DMND}" ]]; then
log "ERROR: No *.dmnd found under ${WORKDIR}/gunc/db after download."
ls -lah "${WORKDIR}/gunc/db" || true
exit 1
fi
log "Using GUNC db_file: ${DMND}"
gunc run \
--db_file "${DMND}" \
--input_fasta "${WORKDIR}/genomes/${SAMPLE}.fasta" \
--out_dir "${WORKDIR}/gunc/out" \
--threads "${THREADS}" \
--detailed_output \
--contig_taxonomy_output \
--use_species_level
ALL_LEVELS="$(find "${WORKDIR}/gunc/out" -name "*all_levels.tsv" | head -n 1 || true)"
test -n "${ALL_LEVELS}"
log "Found GUNC all_levels.tsv: ${ALL_LEVELS}"
# =========================
# 5) Parse outputs and write Table 1 TSV
log "Parse outputs → ${OUT_TSV}"
export SAMPLE WORKDIR OUT_TSV GUNC_ALL_LEVELS="${ALL_LEVELS}"
python - <<'PY'
import csv, os

sample = os.environ["SAMPLE"]
workdir = os.environ["WORKDIR"]
out_tsv = os.environ["OUT_TSV"]
gunc_all_levels = os.environ["GUNC_ALL_LEVELS"]
quast_tsv = os.path.join(workdir, "quast", "report.tsv")
mos_summary = os.path.join(workdir, "map", f"{sample}.mosdepth.summary.txt")
checkm_sum = os.path.join(workdir, "checkm", "checkm_summary.tsv")

def read_quast(path):
    # report.tsv is a two-column metric/value table; column 1 holds the assembly
    with open(path, newline="") as f:
        rows = list(csv.reader(f, delimiter="\t"))
    asm_idx = 1
    d = {}
    for r in rows[1:]:
        if not r: continue
        key = r[0].strip()
        val = r[asm_idx].strip() if asm_idx < len(r) else ""
        d[key] = val
    return d

def read_mosdepth(path):
    # the "total" row of the summary carries the genome-wide mean depth (4th column)
    with open(path) as f:
        for line in f:
            if line.startswith("chrom"): continue
            parts = line.rstrip("\n").split("\t")
            if len(parts) >= 4 and parts[0] == "total":
                return parts[3]
    return ""

def read_checkm(path, sample):
    with open(path, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            bid = row.get("Bin Id") or row.get("Bin") or row.get("bin_id") or ""
            if bid == sample:
                return row
    return {}

def read_gunc_all_levels(path):
    coarse_lvls = {"kingdom","phylum","class"}
    fine_lvls = {"order","family","genus","species"}
    coarse, fine = [], []
    best_line = None
    rank = {"kingdom":0,"phylum":1,"class":2,"order":3,"family":4,"genus":5,"species":6}
    best_rank = -1
    with open(path, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            lvl = (row.get("taxonomic_level") or "").strip()
            p = row.get("proportion_genes_retained_in_major_clades") or ""
            try:
                pv = float(p)
            except ValueError:
                pv = None
            if pv is not None:
                if lvl in coarse_lvls: coarse.append(pv)
                if lvl in fine_lvls: fine.append(pv)
            # remember the row of the finest taxonomic level seen so far
            if lvl in rank and rank[lvl] > best_rank:
                best_rank = rank[lvl]
                best_line = row
    coarse_mean = sum(coarse)/len(coarse) if coarse else ""
    fine_mean = sum(fine)/len(fine) if fine else ""
    contamination_portion = best_line.get("contamination_portion","") if best_line else ""
    pass_gunc = best_line.get("pass.GUNC","") if best_line else ""
    return coarse_mean, fine_mean, contamination_portion, pass_gunc

qu = read_quast(quast_tsv)
mean_depth = read_mosdepth(mos_summary)
ck = read_checkm(checkm_sum, sample)
coarse_mean, fine_mean, contamination_portion, pass_gunc = read_gunc_all_levels(gunc_all_levels)

header = [
    "Sample",
    "Genome_length_bp",
    "Contigs",
    "N50_bp",
    "L50",
    "GC_percent",
    "Mean_depth_x",
    "CheckM_completeness_percent",
    "CheckM_contamination_percent",
    "CheckM_strain_heterogeneity_percent",
    "GUNC_coarse_consistency",
    "GUNC_fine_consistency",
    "GUNC_contamination_portion",
    "GUNC_pass"
]
row = [
    sample,
    qu.get("Total length", ""),
    qu.get("# contigs", ""),
    qu.get("N50", ""),
    qu.get("L50", ""),
    qu.get("GC (%)", ""),
    mean_depth,
    ck.get("Completeness", ""),
    ck.get("Contamination", ""),
    ck.get("Strain heterogeneity", ""),
    f"{coarse_mean:.4f}" if isinstance(coarse_mean, float) else coarse_mean,
    f"{fine_mean:.4f}" if isinstance(fine_mean, float) else fine_mean,
    contamination_portion,
    pass_gunc
]

with open(out_tsv, "w", newline="") as f:
    w = csv.writer(f, delimiter="\t")
    w.writerow(header)
    w.writerow(row)
print(f"OK: wrote {out_tsv}")
PY
log "SUCCESS"
log "Output TSV: ${OUT_TSV}"
log "Workdir: ${WORKDIR}"
log "Logfile: ${LOGFILE}"
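
The two GUNC consistency columns in Table 1 are plain averages of one GUNC column over two groups of taxonomic levels. The following is a minimal, stand-alone sketch of that aggregation, assuming the same column names the script reads (taxonomic_level, proportion_genes_retained_in_major_clades). The function name gunc_consistency and the numbers in EXAMPLE are made up for illustration and are not GUNC output.

```python
import csv
import io

COARSE = {"kingdom", "phylum", "class"}
FINE = {"order", "family", "genus", "species"}


def gunc_consistency(tsv_text):
    """Return (coarse_mean, fine_mean) of the retained-gene proportion.

    Hypothetical helper: it mirrors the averaging done in the script above,
    but reads from a string instead of a file.
    """
    coarse, fine = [], []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        level = (row.get("taxonomic_level") or "").strip()
        try:
            value = float(row.get("proportion_genes_retained_in_major_clades") or "")
        except ValueError:
            continue  # skip rows with a missing or non-numeric value
        if level in COARSE:
            coarse.append(value)
        elif level in FINE:
            fine.append(value)

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(coarse), mean(fine)


# Made-up values, for illustration only.
EXAMPLE = (
    "taxonomic_level\tproportion_genes_retained_in_major_clades\n"
    "kingdom\t1.0\n"
    "phylum\t1.0\n"
    "class\t0.5\n"
    "genus\t0.75\n"
    "species\t0.25\n"
)

coarse_mean, fine_mean = gunc_consistency(EXAMPLE)
print(coarse_mean, fine_mean)
```

Levels missing from the table simply do not contribute to either mean, which is why the script writes an empty cell when a group has no numeric rows.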
export_table1_stats_to_excel_py36_compat.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Export a comprehensive Excel workbook from a Table1 pipeline workdir.
Python 3.6 compatible (no PEP604 unions, no builtin generics).
Requires: openpyxl
Sheets (as available):
- Summary
- Table1 (if Table1_*.tsv exists)
- QUAST_report (report.tsv)
- QUAST_metrics (metric/value)
- Mosdepth_summary (*.mosdepth.summary.txt)
- CheckM (checkm_summary.tsv)
- GUNC_* (all .tsv under gunc/out)
- File_Inventory (relative path, size, mtime; optional md5 for small files)
- Run_log_preview (head/tail of latest log under workdir/logs or workdir/*/logs)
"""
from __future__ import print_function

import argparse
import csv
import hashlib
import os
import re
import sys
import time
from pathlib import Path

try:
    from openpyxl import Workbook
    from openpyxl.utils import get_column_letter
except ImportError:
    sys.stderr.write("ERROR: openpyxl is required. Install with:\n"
                     "  conda install -c conda-forge openpyxl\n")
    raise

MAX_XLSX_ROWS = 1048576


def safe_sheet_name(name, used):
    # Excel: <=31 chars, cannot contain: : \ / ? * [ ]
    bad = r'[:\\/?*\[\]]'
    base = name.strip() or "Sheet"
    base = re.sub(bad, "_", base)
    base = base[:31]
    if base not in used:
        used.add(base)
        return base
    # make unique with suffix
    for i in range(2, 1000):
        suffix = "_%d" % i
        cut = 31 - len(suffix)
        candidate = (base[:cut] + suffix)
        if candidate not in used:
            used.add(candidate)
            return candidate
    raise RuntimeError("Too many duplicate sheet names for base=%s" % base)


def autosize(ws, max_width=60):
    for col in ws.columns:
        max_len = 0
        col_letter = get_column_letter(col[0].column)
        for cell in col:
            v = cell.value
            if v is None:
                continue
            s = str(v)
            if len(s) > max_len:
                max_len = len(s)
        ws.column_dimensions[col_letter].width = min(max_width, max(10, max_len + 2))


def write_table(ws, header, rows, max_rows=None):
    if header:
        ws.append(header)
    count = 0
    for r in rows:
        ws.append(r)
        count += 1
        if max_rows is not None and count >= max_rows:
            break


def read_tsv(path, max_rows=None):
    header = []
    rows = []
    with path.open("r", newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        for i, r in enumerate(reader):
            if i == 0:
                header = r
                continue
            rows.append(r)
            if max_rows is not None and len(rows) >= max_rows:
                break
    return header, rows


def read_text_table(path, max_rows=None):
    # for mosdepth summary (tsv with header)
    return read_tsv(path, max_rows=max_rows)


def md5_file(path, chunk=1024*1024):
    h = hashlib.md5()
    with path.open("rb") as f:
        while True:
            b = f.read(chunk)
            if not b:
                break
            h.update(b)
    return h.hexdigest()


def find_latest_log(workdir):
    candidates = []
    # common locations
    for p in [workdir / "logs", workdir / "log", workdir / "Logs"]:
        if p.exists():
            candidates.extend(p.glob("*.log"))
    # nested logs
    candidates.extend(workdir.glob("**/logs/*.log"))
    if not candidates:
        return None
    candidates.sort(key=lambda x: x.stat().st_mtime, reverse=True)
    return candidates[0]


def add_summary_sheet(wb, used, info_items):
    ws = wb.create_sheet(title=safe_sheet_name("Summary", used))
    ws.append(["Key", "Value"])
    for k, v in info_items:
        ws.append([k, v])
    autosize(ws)


def add_log_preview(wb, used, log_path, head_n=80, tail_n=120):
    if log_path is None or not log_path.exists():
        return
    ws = wb.create_sheet(title=safe_sheet_name("Run_log_preview", used))
    ws.append(["Log path", str(log_path)])
    ws.append([])
    lines = log_path.read_text(errors="replace").splitlines()
    ws.append(["--- HEAD (%d) ---" % head_n])
    for line in lines[:head_n]:
        ws.append([line])
    ws.append([])
    ws.append(["--- TAIL (%d) ---" % tail_n])
    for line in lines[-tail_n:]:
        ws.append([line])
    ws.column_dimensions["A"].width = 120


def add_file_inventory(wb, used, workdir, do_md5=True, md5_max_bytes=200*1024*1024, max_rows=None):
    ws = wb.create_sheet(title=safe_sheet_name("File_Inventory", used))
    ws.append(["relative_path", "size_bytes", "mtime_iso", "md5(optional)"])
    count = 0
    for p in sorted(workdir.rglob("*")):
        if p.is_dir():
            continue
        rel = str(p.relative_to(workdir))
        st = p.stat()
        mtime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime))
        md5 = ""
        if do_md5 and st.st_size <= md5_max_bytes:
            try:
                md5 = md5_file(p)
            except Exception:
                md5 = "ERROR"
        ws.append([rel, st.st_size, mtime, md5])
        count += 1
        if max_rows is not None and count >= max_rows:
            break
    autosize(ws, max_width=80)


def add_tsv_sheet(wb, used, name, path, max_rows=None):
    header, rows = read_tsv(path, max_rows=max_rows)
    ws = wb.create_sheet(title=safe_sheet_name(name, used))
    write_table(ws, header, rows, max_rows=max_rows)
    autosize(ws, max_width=80)


def add_quast_metrics_sheet(wb, used, quast_report_tsv):
    header, rows = read_tsv(quast_report_tsv, max_rows=None)
    if not header or len(header) < 2:
        return
    asm_name = header[1]
    ws = wb.create_sheet(title=safe_sheet_name("QUAST_metrics", used))
    ws.append(["Metric", asm_name])
    for r in rows:
        if not r:
            continue
        metric = r[0]
        val = r[1] if len(r) > 1 else ""
        ws.append([metric, val])
    autosize(ws, max_width=80)


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--workdir", required=True, help="workdir produced by pipeline (e.g., table1_GE11174_work)")
    ap.add_argument("--out", required=True, help="output .xlsx")
    ap.add_argument("--sample", default="", help="sample name for summary")
    ap.add_argument("--max-rows", type=int, default=200000, help="max rows per large sheet")
    ap.add_argument("--no-md5", action="store_true", help="skip md5 calculation in File_Inventory")
    args = ap.parse_args()

    workdir = Path(args.workdir).resolve()
    out = Path(args.out).resolve()
    if not workdir.exists():
        sys.stderr.write("ERROR: workdir not found: %s\n" % workdir)
        sys.exit(2)

    wb = Workbook()
    # remove default sheet
    wb.remove(wb.active)
    used = set()

    # Summary info
    info = [
        ("sample", args.sample or ""),
        ("workdir", str(workdir)),
        ("generated_at", time.strftime("%Y-%m-%d %H:%M:%S")),
        ("python", sys.version.replace("\n", " ")),
        ("openpyxl", __import__("openpyxl").__version__),
    ]
    add_summary_sheet(wb, used, info)

    # Table1 TSV (try common names)
    table1_candidates = list(workdir.glob("Table1_*.tsv")) + list(workdir.glob("*.tsv"))
    # Prefer Table1_*.tsv in workdir root
    table1_path = None
    for p in table1_candidates:
        if p.name.startswith("Table1_") and p.suffix == ".tsv":
            table1_path = p
            break
    if table1_path is None:
        # maybe created in cwd, not inside workdir; try alongside workdir
        parent = workdir.parent
        for p in parent.glob("Table1_*.tsv"):
            if args.sample and args.sample in p.name:
                table1_path = p
                break
        if table1_path is None and list(parent.glob("Table1_*.tsv")):
            table1_path = sorted(parent.glob("Table1_*.tsv"))[0]
    if table1_path is not None and table1_path.exists():
        add_tsv_sheet(wb, used, "Table1", table1_path, max_rows=args.max_rows)

    # QUAST
    quast_report = workdir / "quast" / "report.tsv"
    if quast_report.exists():
        add_tsv_sheet(wb, used, "QUAST_report", quast_report, max_rows=args.max_rows)
        add_quast_metrics_sheet(wb, used, quast_report)

    # Mosdepth summary
    for p in sorted((workdir / "map").glob("*.mosdepth.summary.txt")):
        # mosdepth summary is TSV-like
        name = "Mosdepth_" + p.stem.replace(".mosdepth.summary", "")
        add_tsv_sheet(wb, used, name[:31], p, max_rows=args.max_rows)

    # CheckM
    checkm_sum = workdir / "checkm" / "checkm_summary.tsv"
    if checkm_sum.exists():
        add_tsv_sheet(wb, used, "CheckM", checkm_sum, max_rows=args.max_rows)

    # GUNC outputs (all TSV under gunc/out)
    gunc_out = workdir / "gunc" / "out"
    if gunc_out.exists():
        for p in sorted(gunc_out.rglob("*.tsv")):
            rel = str(p.relative_to(gunc_out))
            sheet = "GUNC_" + rel.replace("/", "_").replace("\\", "_").replace(".tsv", "")
            add_tsv_sheet(wb, used, sheet[:31], p, max_rows=args.max_rows)

    # Log preview
    latest_log = find_latest_log(workdir)
    add_log_preview(wb, used, latest_log)

    # File inventory
    add_file_inventory(
        wb, used, workdir,
        do_md5=(not args.no_md5),
        md5_max_bytes=200*1024*1024,
        max_rows=args.max_rows
    )

    # Save
    out.parent.mkdir(parents=True, exist_ok=True)
    wb.save(str(out))
    print("OK: wrote %s" % out)


if __name__ == "__main__":
    main()
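
Sheet names are the part of this exporter most likely to collide, because long GUNC paths are cut to Excel's 31-character limit before being made unique. The sketch below shows the same idea in isolation; unique_sheet_name is a hypothetical stand-in for the script's safe_sheet_name (it loops without the 1000-attempt cap), and the input path is invented.

```python
import re

# Characters Excel does not allow in sheet names: : \ / ? * [ ]
_BAD_CHARS = re.compile(r"[:\\/?*\[\]]")
MAX_LEN = 31  # Excel's limit on sheet-name length


def unique_sheet_name(name, used):
    """Sanitize `name`, cut it to 31 characters, and make it unique within `used`."""
    base = _BAD_CHARS.sub("_", name.strip() or "Sheet")[:MAX_LEN]
    candidate = base
    n = 1
    while candidate in used:
        n += 1
        suffix = "_%d" % n
        # shorten the base so the suffixed name still fits in 31 characters
        candidate = base[:MAX_LEN - len(suffix)] + suffix
    used.add(candidate)
    return candidate


used = set()
first = unique_sheet_name("GUNC_out/gunc_output/sample.all_levels", used)
second = unique_sheet_name("GUNC_out/gunc_output/sample.all_levels", used)
print(first)
print(second)
```

Because the suffix replaces the tail of the name rather than being appended, two long names that differ only after the 31st character end up as name and name_2, so the sheet title alone may not tell them apart.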
run_resistome_virulome_dedup.sh
#!/usr/bin/env bash
set -Eeuo pipefail
# -------- user inputs --------
ENV_NAME="${ENV_NAME:-bengal3_ac3}"
ASM="${ASM:-GE11174.fasta}"
SAMPLE="${SAMPLE:-GE11174}"
OUTDIR="${OUTDIR:-resistome_virulence_${SAMPLE}}"
THREADS="${THREADS:-16}"
# thresholds (set to 0/0 if you truly want ABRicate defaults)
MINID="${MINID:-90}"
MINCOV="${MINCOV:-60}"
# ----------------------------
log(){ echo "[$(date +'%F %T')] $*" >&2; }
need_cmd(){ command -v "$1" >/dev/null 2>&1; }
activate_env() {
# shellcheck disable=SC1091
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate "${ENV_NAME}"
}
main(){
activate_env
mkdir -p "${OUTDIR}"/{raw,amr,virulence,card,tmp}
log "Env: ${ENV_NAME}"
log "ASM: ${ASM}"
log "Sample: ${SAMPLE}"
log "Outdir: ${OUTDIR}"
log "ABRicate thresholds: MINID=${MINID} MINCOV=${MINCOV}"
log "ABRicate DB list:"
abricate --list | grep -Ei "vfdb|resfinder|megares|card" || true
# Make sure indices exist
log "Running abricate --setupdb (safe even if already done)..."
abricate --setupdb
# ---- ABRicate AMR DBs ----
log "Running ABRicate: ResFinder"
abricate --db resfinder --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.resfinder.tab"
log "Running ABRicate: MEGARes"
abricate --db megares --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.megares.tab"
# ---- Virulence (VFDB) ----
log "Running ABRicate: VFDB"
abricate --db vfdb --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.vfdb.tab"
# ---- CARD: prefer RGI if available, else ABRicate card ----
CARD_MODE="ABRicate"
if need_cmd rgi; then
log "RGI found. Trying RGI (CARD) ..."
set +e
rgi main --input_sequence "${ASM}" --output_file "${OUTDIR}/card/${SAMPLE}.rgi" --input_type contig --num_threads "${THREADS}"
rc=$?
set -e
if [[ $rc -eq 0 ]]; then
CARD_MODE="RGI"
else
log "RGI failed (likely CARD data not installed). Falling back to ABRicate card."
fi
fi
if [[ "${CARD_MODE}" == "ABRicate" ]]; then
log "Running ABRicate: CARD"
abricate --db card --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.card.tab"
fi
# ---- Build deduplicated tables ----
log "Creating deduplicated AMR/VFDB tables..."
export OUTDIR SAMPLE CARD_MODE
python - <<'PY'
import os, re
from pathlib import Path
import pandas as pd
from io import StringIO

outdir = Path(os.environ["OUTDIR"])
sample = os.environ["SAMPLE"]
card_mode = os.environ["CARD_MODE"]

def read_abricate_tab(path: Path, source: str) -> pd.DataFrame:
    if not path.exists() or path.stat().st_size == 0:
        return pd.DataFrame()
    header = None
    lines = []
    with path.open("r", errors="replace") as f:
        for line in f:
            if not line.strip():
                continue
            # ABRicate's column header starts with "#FILE"; keep it (minus the "#")
            # so the columns are named GENE, %IDENTITY, ... and skip other comments.
            if line.startswith("#FILE"):
                header = line.lstrip("#")
                continue
            if line.startswith("#"):
                continue
            lines.append(line)
    if header is None:
        return pd.DataFrame()
    df = pd.read_csv(StringIO(header + "".join(lines)), sep="\t", dtype=str)
    df.insert(0, "Source", source)
    return df

def to_num(s):
    try:
        return float(str(s).replace("%",""))
    except (TypeError, ValueError):
        return None

def normalize_abricate(df: pd.DataFrame, dbname: str) -> pd.DataFrame:
    if df.empty:
        return pd.DataFrame(columns=[
            "Source","Database","Gene","Product","Accession","Contig","Start","End","Strand","Pct_Identity","Pct_Coverage"
        ])
    # Column names vary slightly; handle common ones
    gene = "GENE" if "GENE" in df.columns else None
    prod = "PRODUCT" if "PRODUCT" in df.columns else None
    acc = "ACCESSION" if "ACCESSION" in df.columns else None
    contig = "SEQUENCE" if "SEQUENCE" in df.columns else ("CONTIG" if "CONTIG" in df.columns else None)
    start = "START" if "START" in df.columns else None
    end = "END" if "END" in df.columns else None
    strand = "STRAND" if "STRAND" in df.columns else None
    pid = "%IDENTITY" if "%IDENTITY" in df.columns else ("% Identity" if "% Identity" in df.columns else None)
    pcv = "%COVERAGE" if "%COVERAGE" in df.columns else ("% Coverage" if "% Coverage" in df.columns else None)
    out = pd.DataFrame()
    out["Source"] = df["Source"]
    out["Database"] = dbname
    out["Gene"] = df[gene] if gene else ""
    out["Product"] = df[prod] if prod else ""
    out["Accession"] = df[acc] if acc else ""
    out["Contig"] = df[contig] if contig else ""
    out["Start"] = df[start] if start else ""
    out["End"] = df[end] if end else ""
    out["Strand"] = df[strand] if strand else ""
    out["Pct_Identity"] = df[pid] if pid else ""
    out["Pct_Coverage"] = df[pcv] if pcv else ""
    return out

def dedup_best(df: pd.DataFrame, key_cols):
    """Keep best hit per key by highest identity, then coverage, then longest span."""
    if df.empty:
        return df
    # numeric helpers
    df = df.copy()
    df["_pid"] = df["Pct_Identity"].map(to_num)
    df["_pcv"] = df["Pct_Coverage"].map(to_num)
    def span(row):
        try:
            return abs(int(row["End"]) - int(row["Start"])) + 1
        except (TypeError, ValueError):
            return 0
    df["_span"] = df.apply(span, axis=1)
    # sort best-first
    df = df.sort_values(by=["_pid","_pcv","_span"], ascending=[False,False,False], na_position="last")
    df = df.drop_duplicates(subset=key_cols, keep="first")
    df = df.drop(columns=["_pid","_pcv","_span"])
    return df

# ---------- AMR inputs ----------
amr_frames = []

# ResFinder (often 0 hits; still okay)
resfinder = outdir / "raw" / f"{sample}.resfinder.tab"
df = read_abricate_tab(resfinder, "ABRicate")
amr_frames.append(normalize_abricate(df, "ResFinder"))

# MEGARes
megares = outdir / "raw" / f"{sample}.megares.tab"
df = read_abricate_tab(megares, "ABRicate")
amr_frames.append(normalize_abricate(df, "MEGARes"))

# CARD: RGI or ABRicate
if card_mode == "RGI":
    # Try common RGI tab outputs
    prefix = outdir / "card" / f"{sample}.rgi"
    rgi_tab = None
    for ext in [".txt",".tab",".tsv"]:
        p = Path(str(prefix) + ext)
        if p.exists() and p.stat().st_size > 0:
            rgi_tab = p
            break
    if rgi_tab is not None:
        rgi = pd.read_csv(rgi_tab, sep="\t", dtype=str)

        def pick(*names):
            # first column name present in the RGI table, else empty strings
            for n in names:
                if n in rgi.columns:
                    return rgi[n]
            return ""

        # Start from rgi's index so the scalar columns below are filled on every
        # row; assigning a scalar to an index-less frame leaves them empty.
        out = pd.DataFrame(index=rgi.index)
        out["Source"] = "RGI"
        out["Database"] = "CARD"
        # Prefer ARO_name/Best_Hit_ARO if present. The second name in each pair
        # below is the header we assume recent `rgi main` output uses; check
        # against your RGI version.
        out["Gene"] = pick("ARO_name", "Best_Hit_ARO")
        out["Product"] = pick("ARO_name")
        out["Accession"] = pick("ARO_accession", "ARO")
        out["Contig"] = pick("Sequence", "Contig")
        out["Start"] = pick("Start")
        out["End"] = pick("Stop", "End")
        out["Strand"] = pick("Orientation")
        out["Pct_Identity"] = pick("% Identity", "Best_Identities")
        out["Pct_Coverage"] = pick("% Coverage", "Percentage Length of Reference Sequence")
        amr_frames.append(out)
else:
    card = outdir / "raw" / f"{sample}.card.tab"
    df = read_abricate_tab(card, "ABRicate")
    amr_frames.append(normalize_abricate(df, "CARD"))

amr_all = pd.concat([x for x in amr_frames if not x.empty], ignore_index=True) if any(not x.empty for x in amr_frames) else pd.DataFrame(
    columns=["Source","Database","Gene","Product","Accession","Contig","Start","End","Strand","Pct_Identity","Pct_Coverage"]
)

# Deduplicate within each (Database,Gene) – this is usually what you want for manuscript tables
amr_dedup = dedup_best(amr_all, key_cols=["Database","Gene"])
# Sort nicely
if not amr_dedup.empty:
    amr_dedup = amr_dedup.sort_values(["Database","Gene"]).reset_index(drop=True)
amr_out = outdir / "Table_AMR_genes_dedup.tsv"
amr_dedup.to_csv(amr_out, sep="\t", index=False)

# ---------- Virulence (VFDB) ----------
vfdb = outdir / "raw" / f"{sample}.vfdb.tab"
vf = read_abricate_tab(vfdb, "ABRicate")
vf_norm = normalize_abricate(vf, "VFDB")
# Dedup within (Gene) for VFDB (or use Database,Gene; Database constant)
vf_dedup = dedup_best(vf_norm, key_cols=["Gene"]) if not vf_norm.empty else vf_norm
if not vf_dedup.empty:
    vf_dedup = vf_dedup.sort_values(["Gene"]).reset_index(drop=True)
vf_out = outdir / "Table_Virulence_VFDB_dedup.tsv"
vf_dedup.to_csv(vf_out, sep="\t", index=False)

print("OK wrote:")
print(" ", amr_out)
print(" ", vf_out)
PY
log "Done."
log "Outputs:"
log " ${OUTDIR}/Table_AMR_genes_dedup.tsv"
log " ${OUTDIR}/Table_Virulence_VFDB_dedup.tsv"
log "Raw:"
log " ${OUTDIR}/raw/${SAMPLE}.*.tab"
}
main
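
The deduplication rule used above (best hit per key by identity, then coverage, then span) can be sketched without pandas. This is only an illustration under stated assumptions: best_hit_per_key is a hypothetical helper, the hits are invented, and every field is assumed to be present and numeric, which the real script does not assume.

```python
def best_hit_per_key(hits, key_fields):
    """Keep one hit per key: highest identity, then coverage, then longest span."""

    def score(hit):
        span = abs(int(hit["End"]) - int(hit["Start"])) + 1
        return (float(hit["Pct_Identity"]), float(hit["Pct_Coverage"]), span)

    best = {}
    for hit in hits:
        key = tuple(hit[field] for field in key_fields)
        # tuples compare element by element, so identity is decided first
        if key not in best or score(hit) > score(best[key]):
            best[key] = hit
    return [best[key] for key in sorted(best)]


# Invented hits, for illustration only.
hits = [
    {"Database": "CARD", "Gene": "geneA", "Start": "100", "End": "900",
     "Pct_Identity": "98.5", "Pct_Coverage": "100.0"},
    {"Database": "CARD", "Gene": "geneA", "Start": "5000", "End": "5400",
     "Pct_Identity": "99.1", "Pct_Coverage": "50.0"},
    {"Database": "CARD", "Gene": "geneB", "Start": "10", "End": "310",
     "Pct_Identity": "91.0", "Pct_Coverage": "88.0"},
]

kept = best_hit_per_key(hits, ["Database", "Gene"])
for hit in kept:
    print(hit["Gene"], hit["Start"], hit["Pct_Identity"], hit["Pct_Coverage"])
```

In this made-up case the half-length geneA hit wins because identity is compared before coverage. If full-length hits should be preferred, swapping the first two elements of the score tuple is one option.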
run_abricate_resistome_virulome_one_per_gene.sh
#!/usr/bin/env bash
set -Eeuo pipefail
# ------------------- USER SETTINGS -------------------
ENV_NAME="${ENV_NAME:-bengal3_ac3}"
ASM="${ASM:-GE11174.fasta}" # input assembly fasta
SAMPLE="${SAMPLE:-GE11174}"
OUTDIR="${OUTDIR:-resistome_virulence_${SAMPLE}}"
THREADS="${THREADS:-16}"
# ABRicate thresholds
# If you want your earlier "35 genes" behavior, use MINID=70 MINCOV=50.
# If you want stricter: e.g. MINID=80 MINCOV=70.
MINID="${MINID:-70}"
MINCOV="${MINCOV:-50}"
# -----------------------------------------------------
ts(){ date +"%F %T"; }
log(){ echo "[$(ts)] $*" >&2; }
on_err(){
local ec=$?
log "ERROR: failed (exit=${ec}) at line ${BASH_LINENO[0]}: ${BASH_COMMAND}"
exit $ec
}
trap on_err ERR
need_cmd(){ command -v "$1" >/dev/null 2>&1; }
activate_env() {
# shellcheck disable=SC1091
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate "${ENV_NAME}"
}
main(){
activate_env
log "Env: ${ENV_NAME}"
log "ASM: ${ASM}"
log "Sample: ${SAMPLE}"
log "Outdir: ${OUTDIR}"
log "Threads: ${THREADS}"
log "ABRicate thresholds: MINID=${MINID} MINCOV=${MINCOV}"
mkdir -p "${OUTDIR}"/{raw,logs}
# Save full log
LOGFILE="${OUTDIR}/logs/run_$(date +'%F_%H%M%S').log"
exec > >(tee -a "${LOGFILE}") 2>&1
log "Tool versions:"
abricate --version || true
abricate-get_db --help | head -n 5 || true
log "ABRicate DB list (selected):"
abricate --list | grep -Ei "vfdb|resfinder|megares|card" || true
log "Indexing ABRicate databases (safe to re-run)..."
abricate --setupdb
# ---------------- Run ABRicate ----------------
log "Running ABRicate: MEGARes"
abricate --db megares --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.megares.tab"
log "Running ABRicate: CARD"
abricate --db card --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.card.tab"
log "Running ABRicate: ResFinder"
abricate --db resfinder --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.resfinder.tab"
log "Running ABRicate: VFDB"
abricate --db vfdb --minid "${MINID}" --mincov "${MINCOV}" "${ASM}" > "${OUTDIR}/raw/${SAMPLE}.vfdb.tab"
# --------------- Build tables -----------------
export OUTDIR SAMPLE
export MEGARES_TAB="${OUTDIR}/raw/${SAMPLE}.megares.tab"
export CARD_TAB="${OUTDIR}/raw/${SAMPLE}.card.tab"
export RESFINDER_TAB="${OUTDIR}/raw/${SAMPLE}.resfinder.tab"
export VFDB_TAB="${OUTDIR}/raw/${SAMPLE}.vfdb.tab"
export AMR_OUT="${OUTDIR}/Table_AMR_genes_one_per_gene.tsv"
export VIR_OUT="${OUTDIR}/Table_Virulence_VFDB_dedup.tsv"
export STATUS_OUT="${OUTDIR}/Table_DB_hit_counts.tsv"
log "Generating deduplicated tables..."
python - <<'PY'
import os
import pandas as pd
from pathlib import Path

megares_tab = Path(os.environ["MEGARES_TAB"])
card_tab = Path(os.environ["CARD_TAB"])
resfinder_tab = Path(os.environ["RESFINDER_TAB"])
vfdb_tab = Path(os.environ["VFDB_TAB"])
amr_out = Path(os.environ["AMR_OUT"])
vir_out = Path(os.environ["VIR_OUT"])
status_out = Path(os.environ["STATUS_OUT"])

def read_abricate(path: Path) -> pd.DataFrame:
    """Parse ABRicate .tab where header line starts with '#FILE'."""
    if (not path.exists()) or path.stat().st_size == 0:
        return pd.DataFrame()
    header = None
    rows = []
    with path.open("r", errors="replace") as f:
        for line in f:
            if not line.strip():
                continue
            if line.startswith("#FILE"):
                header = line.lstrip("#").rstrip("\n").split("\t")
                continue
            if line.startswith("#"):
                continue
            rows.append(line.rstrip("\n").split("\t"))
    if header is None:
        return pd.DataFrame()
    if not rows:
        return pd.DataFrame(columns=header)
    return pd.DataFrame(rows, columns=header)

def normalize(df: pd.DataFrame, dbname: str) -> pd.DataFrame:
    cols_out = ["Database","Gene","Product","Accession","Contig","Start","End","Strand","Pct_Identity","Pct_Coverage"]
    if df is None or df.empty:
        return pd.DataFrame(columns=cols_out)
    out = pd.DataFrame({
        "Database": dbname,
        "Gene": df.get("GENE",""),
        "Product": df.get("PRODUCT",""),
        "Accession": df.get("ACCESSION",""),
        "Contig": df.get("SEQUENCE",""),
        "Start": df.get("START",""),
        "End": df.get("END",""),
        "Strand": df.get("STRAND",""),
        "Pct_Identity": pd.to_numeric(df.get("%IDENTITY",""), errors="coerce"),
        "Pct_Coverage": pd.to_numeric(df.get("%COVERAGE",""), errors="coerce"),
    })
    return out[cols_out]

def best_hit_dedup(df: pd.DataFrame, key_cols):
    """Keep best hit by highest identity, then coverage, then alignment length."""
    if df.empty:
        return df
    d = df.copy()
    d["Start_i"] = pd.to_numeric(d["Start"], errors="coerce").fillna(0).astype(int)
    d["End_i"] = pd.to_numeric(d["End"], errors="coerce").fillna(0).astype(int)
    d["Len"] = (d["End_i"] - d["Start_i"]).abs() + 1
    d = d.sort_values(["Pct_Identity","Pct_Coverage","Len"], ascending=[False,False,False])
    d = d.drop_duplicates(subset=key_cols, keep="first")
    return d.drop(columns=["Start_i","End_i","Len"])

def count_hits(path: Path) -> int:
    if not path.exists():
        return 0
    n = 0
    with path.open() as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            n += 1
    return n

# -------- load + normalize --------
parts = []
for dbname, p in [("MEGARes", megares_tab), ("CARD", card_tab), ("ResFinder", resfinder_tab)]:
    df = read_abricate(p)
    parts.append(normalize(df, dbname))
amr_all = pd.concat([x for x in parts if not x.empty], ignore_index=True) if any(not x.empty for x in parts) else pd.DataFrame(
    columns=["Database","Gene","Product","Accession","Contig","Start","End","Strand","Pct_Identity","Pct_Coverage"]
)
# remove empty genes
amr_all = amr_all[amr_all["Gene"].astype(str).str.len() > 0].copy()
# best per (Database,Gene)
amr_db_gene = best_hit_dedup(amr_all, ["Database","Gene"]) if not amr_all.empty else amr_all
# one row per Gene overall, priority: CARD > ResFinder > MEGARes
priority = {"CARD": 0, "ResFinder": 1, "MEGARes": 2}
if not amr_db_gene.empty:
    amr_db_gene["prio"] = amr_db_gene["Database"].map(priority).fillna(9).astype(int)
    amr_one = amr_db_gene.sort_values(
        ["Gene","prio","Pct_Identity","Pct_Coverage"],
        ascending=[True, True, False, False]
    )
    amr_one = amr_one.drop_duplicates(["Gene"], keep="first").drop(columns=["prio"])
    amr_one = amr_one.sort_values(["Gene"]).reset_index(drop=True)
else:
    amr_one = amr_db_gene
amr_out.parent.mkdir(parents=True, exist_ok=True)
amr_one.to_csv(amr_out, sep="\t", index=False)

# -------- VFDB --------
vf = normalize(read_abricate(vfdb_tab), "VFDB")
vf = vf[vf["Gene"].astype(str).str.len() > 0].copy()
vf_one = best_hit_dedup(vf, ["Gene"]) if not vf.empty else vf
if not vf_one.empty:
    vf_one = vf_one.sort_values(["Gene"]).reset_index(drop=True)
vir_out.parent.mkdir(parents=True, exist_ok=True)
vf_one.to_csv(vir_out, sep="\t", index=False)

# -------- status counts --------
status = pd.DataFrame([
    {"Database":"MEGARes", "Hit_lines": count_hits(megares_tab), "File": str(megares_tab)},
    {"Database":"CARD", "Hit_lines": count_hits(card_tab), "File": str(card_tab)},
    {"Database":"ResFinder", "Hit_lines": count_hits(resfinder_tab), "File": str(resfinder_tab)},
    {"Database":"VFDB", "Hit_lines": count_hits(vfdb_tab), "File": str(vfdb_tab)},
])
status_out.parent.mkdir(parents=True, exist_ok=True)
status.to_csv(status_out, sep="\t", index=False)

print("OK wrote:")
print(" ", amr_out, "rows=", len(amr_one))
print(" ", vir_out, "rows=", len(vf_one))
print(" ", status_out)
PY
log "Finished."
log "Main outputs:"
log " ${AMR_OUT}"
log " ${VIR_OUT}"
log " ${STATUS_OUT}"
log "Raw ABRicate outputs:"
log " ${OUTDIR}/raw/${SAMPLE}.megares.tab"
log " ${OUTDIR}/raw/${SAMPLE}.card.tab"
log " ${OUTDIR}/raw/${SAMPLE}.resfinder.tab"
log " ${OUTDIR}/raw/${SAMPLE}.vfdb.tab"
log "Log:"
log " ${LOGFILE}"
}
main
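
The one-row-per-gene step above ranks databases before hit quality: CARD beats ResFinder, which beats MEGARes, and identity and coverage only break ties within a database. A small pandas-free sketch of that ordering follows; one_row_per_gene is a hypothetical helper and the rows are invented.

```python
# Lower number = preferred database, matching the priority map in the script.
PRIORITY = {"CARD": 0, "ResFinder": 1, "MEGARes": 2}


def one_row_per_gene(rows):
    """Pick one row per gene: database priority first, then identity, then coverage."""

    def sort_key(row):
        return (
            row["Gene"],
            PRIORITY.get(row["Database"], 9),  # unknown databases sort last
            -row["Pct_Identity"],
            -row["Pct_Coverage"],
        )

    chosen = {}
    for row in sorted(rows, key=sort_key):
        chosen.setdefault(row["Gene"], row)  # first row per gene is the preferred one
    return [chosen[gene] for gene in sorted(chosen)]


# Invented rows, for illustration only.
rows = [
    {"Database": "MEGARes", "Gene": "geneA", "Pct_Identity": 100.0, "Pct_Coverage": 100.0},
    {"Database": "CARD", "Gene": "geneA", "Pct_Identity": 95.0, "Pct_Coverage": 90.0},
    {"Database": "ResFinder", "Gene": "geneB", "Pct_Identity": 99.0, "Pct_Coverage": 100.0},
]

picked = one_row_per_gene(rows)
for row in picked:
    print(row["Gene"], row["Database"], row["Pct_Identity"])
```

Note that rows are merged only when their Gene strings match exactly, so the same gene reported under different names by different databases stays as separate rows.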
resolve_best_assemblies_entrez.py
#!/usr/bin/env python3
import csv
import os
import re
import sys
import time
from dataclasses import dataclass
from typing import List, Optional, Tuple
from Bio import Entrez
# REQUIRED by NCBI policy
Entrez.email = os.environ.get("NCBI_EMAIL", "your.email@example.com")
# Be nice to NCBI
ENTREZ_DELAY_SEC = float(os.environ.get("ENTREZ_DELAY_SEC", "0.34"))
LEVEL_RANK = {
    "Complete Genome": 0,
    "Chromosome": 1,
    "Scaffold": 2,
    "Contig": 3,
    # sometimes NCBI uses slightly different strings:
    "complete genome": 0,
    "chromosome": 1,
    "scaffold": 2,
    "contig": 3,
}


def level_rank(level: str) -> int:
    # Unknown levels sort last (rank 99)
    return LEVEL_RANK.get(level.strip(), 99)


def is_refseq(accession: str) -> bool:
    return accession.startswith("GCF_")


@dataclass
class AssemblyHit:
    assembly_uid: str
    assembly_accession: str  # GCF_... or GCA_...
    organism: str
    strain: str
    assembly_level: str
    refseq_category: str
    submitter: str
    ftp_path: str


def entrez_search_assembly(term: str, retmax: int = 50) -> List[str]:
    """Return Assembly UIDs matching term."""
    h = Entrez.esearch(db="assembly", term=term, retmax=str(retmax))
    rec = Entrez.read(h)
    h.close()
    time.sleep(ENTREZ_DELAY_SEC)
    return rec.get("IdList", [])


def entrez_esummary_assembly(uids: List[str]) -> List[AssemblyHit]:
    """Fetch assembly summary records for given UIDs."""
    if not uids:
        return []
    h = Entrez.esummary(db="assembly", id=",".join(uids), report="full")
    rec = Entrez.read(h)
    h.close()
    time.sleep(ENTREZ_DELAY_SEC)
    hits: List[AssemblyHit] = []
    docs = rec.get("DocumentSummarySet", {}).get("DocumentSummary", [])
    for d in docs:
        # Some fields can be missing, so every lookup falls back to "".
        acc = str(d.get("AssemblyAccession", "")).strip()
        org = str(d.get("Organism", "")).strip()
        # NCBI uses "AssemblyStatus" sometimes, "AssemblyLevel" other times;
        # in practice AssemblyStatus often equals "Complete Genome"/"Chromosome"/...
        level = str(d.get("AssemblyStatus", "")).strip() or str(d.get("AssemblyLevel", "")).strip()
        # Strain is not always in a clean field. As a stand-in this stores the
        # "Biosample" value; it is not used for ranking or written to the output.
        strain = str(d.get("Biosample", "")).strip()
        submitter = str(d.get("SubmitterOrganization", "")).strip()
        refcat = str(d.get("RefSeq_category", "")).strip()
        ftp = str(d.get("FtpPath_RefSeq", "")).strip() or str(d.get("FtpPath_GenBank", "")).strip()
        hits.append(
            AssemblyHit(
                assembly_uid=str(d.get("Uid", "")),
                assembly_accession=acc,
                organism=org,
                strain=strain,
                assembly_level=level,
                refseq_category=refcat,
                submitter=submitter,
                ftp_path=ftp,
            )
        )
    return hits


def best_hit(hits: List[AssemblyHit]) -> Optional[AssemblyHit]:
    """Pick the best hit: by level (Complete > Chromosome > ...), then RefSeq over GenBank, then reference/representative category."""
    if not hits:
        return None

    def key(h: AssemblyHit) -> Tuple[int, int, int, str]:
        # lower is better
        lvl = level_rank(h.assembly_level)
        ref = 0 if is_refseq(h.assembly_accession) else 1
        # prefer reference, then representative, then everything else
        cat = (h.refseq_category or "").lower()
        if "reference" in cat:
            rep = 0
        elif "representative" in cat:
            rep = 1
        else:
            rep = 2
        # tie-breaker: accession string (stable)
        return (lvl, ref, rep, h.assembly_accession)

    return sorted(hits, key=key)[0]


def relaxed_fallback_terms(organism: str, strain_tokens: List[str]) -> List[str]:
    """
    Build fallback search terms:
    1) organism + strain tokens
    2) organism only (species-only)
    3) genus-only (if species fails)
    """
    terms = []
    # 1) Full term: organism + strain tokens
    if strain_tokens:
        t = f'"{organism}"[Organism] AND (' + " OR ".join(f'"{s}"[All Fields]' for s in strain_tokens) + ")"
        terms.append(t)
    # 2) Species only
    terms.append(f'"{organism}"[Organism]')
    # 3) Genus only
    genus = organism.split()[0]
    terms.append(f'"{genus}"[Organism]')
    return terms


def resolve_one(label: str, organism: str, strain_tokens: List[str], retmax: int = 80) -> Tuple[str, Optional[AssemblyHit], str]:
    """
    Returns:
    - selected accession or "NA"
    - selected hit (optional)
    - which query term matched
    """
    for term in relaxed_fallback_terms(organism, strain_tokens):
        uids = entrez_search_assembly(term, retmax=retmax)
        hits = entrez_esummary_assembly(uids)
        chosen = best_hit(hits)
        if chosen and chosen.assembly_accession:
            return chosen.assembly_accession, chosen, term
    return "NA", None, ""


def parse_targets_tsv(path: str) -> List[Tuple[str, str, List[str]]]:
    """
    Input TSV format (tab-separated, with header):
    label    organism    strain_tokens
    where strain_tokens is a semicolon-separated list, e.g. "FRB97;FRB 97"
    """
    rows = []
    with open(path, newline="") as f:
        r = csv.DictReader(f, delimiter="\t")
        for row in r:
            label = row["label"].strip()
            org = row["organism"].strip()
            # A missing or short column yields None, so fall back to "" before splitting
            tokens = [x.strip() for x in (row.get("strain_tokens") or "").split(";") if x.strip()]
            rows.append((label, org, tokens))
    return rows


def main():
    if len(sys.argv) < 3:
        print("Usage: resolve_best_assemblies_entrez.py targets.tsv out.tsv", file=sys.stderr)
        sys.exit(2)
    targets_tsv = sys.argv[1]
    out_tsv = sys.argv[2]
    targets = parse_targets_tsv(targets_tsv)
    with open(out_tsv, "w", newline="") as f:
        w = csv.writer(f, delimiter="\t")
        w.writerow(["label", "best_accession", "assembly_level", "refseq_category", "organism", "query_used"])
        for label, org, tokens in targets:
            acc, hit, term = resolve_one(label, org, tokens)
            if hit:
                w.writerow([label, acc, hit.assembly_level, hit.refseq_category, hit.organism, term])
                print(f"[OK] {label} -> {acc} ({hit.assembly_level})")
            else:
                w.writerow([label, "NA", "", "", org, ""])
                print(f"[WARN] {label} -> NA (no assemblies found)")


if __name__ == "__main__":
    main()
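The ranking used to pick one assembly per target can be tried offline, without calling Entrez. The sketch below mirrors the sort key described above (assembly level, then RefSeq over GenBank, then reference/representative category, then accession). The `Hit` tuple and the accessions are hypothetical placeholders, not real NCBI records.

```python
from typing import NamedTuple

LEVEL_RANK = {"complete genome": 0, "chromosome": 1, "scaffold": 2, "contig": 3}


class Hit(NamedTuple):
    accession: str
    level: str
    refseq_category: str


def rank_key(hit: Hit):
    """Lower tuples are better; compare level, RefSeq status, category, accession."""
    lvl = LEVEL_RANK.get(hit.level.strip().lower(), 99)   # unknown levels sort last
    ref = 0 if hit.accession.startswith("GCF_") else 1    # RefSeq before GenBank
    cat = hit.refseq_category.lower()
    rep = 0 if "reference" in cat else 1 if "representative" in cat else 2
    return (lvl, ref, rep, hit.accession)


# Hypothetical candidates, for illustration only.
hits = [
    Hit("GCA_000000001.1", "Complete Genome", "na"),
    Hit("GCF_000000002.1", "Scaffold", "representative genome"),
    Hit("GCF_000000003.1", "Complete Genome", "na"),
]
best = min(hits, key=rank_key)
print(best.accession)
```

Because the level is compared first, a complete genome beats a representative scaffold-level assembly. If the representative genome should win instead, the `rep` component would have to move ahead of `lvl` in the tuple.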
build_wgs_tree_fig3B.sh
#!/usr/bin/env bash
set -euo pipefail
###############################################################################
# Core-genome phylogeny pipeline (genome-wide; no 16S/MLST):
#
# Uses existing conda env prefix:
# ENV_NAME=/home/jhuang/miniconda3/envs/bengal3_ac3
#
# Inputs:
# - resolved_accessions.tsv
# - REF.fasta
#
# Optional extra accessions (currently commented out in EXTRA_ASSEMBLIES below;
# uncomment to include them, duplicates are removed):
# GCF_002291425.1, GCF_047901425.1, GCF_004342245.1, GCA_032062225.1
#
# Robustness:
# - Conda activation hook may reference JAVA_HOME under set -u (handled)
# - GFF validation ignores the ##FASTA block (valid GFF3)
# - FIXED: No more double Roary directories (script no longer pre-creates -f dir)
# Logs go to WORKDIR/logs and are also copied into the final Roary dir.
#
# Outputs:
# ${WORKDIR}/plot/core_tree.pdf
# ${WORKDIR}/plot/core_tree.png
###############################################################################
THREADS="${THREADS:-8}"
WORKDIR="${WORKDIR:-work_wgs_tree}"
RESOLVED_TSV="${RESOLVED_TSV:-resolved_accessions.tsv}"
REF_FASTA="${REF_FASTA:-AN6.fasta}"
ENV_NAME="${ENV_NAME:-/home/jhuang/miniconda3/envs/bengal3_ac3}"
EXTRA_ASSEMBLIES=(
#"GCF_002291425.1"
#"GCF_047901425.1"
#"GCF_004342245.1"
#"GCA_032062225.1"
)
CLUSTERS_K="${CLUSTERS_K:-6}"
MODE="${1:-all}"
log(){ echo "[$(date +'%F %T')] $*" >&2; }
need_cmd(){ command -v "$1" >/dev/null 2>&1; }
activate_existing_env(){
if [[ ! -d "${ENV_NAME}" ]]; then
log "ERROR: ENV_NAME path does not exist: ${ENV_NAME}"
exit 1
fi
conda_base="$(dirname "$(dirname "${ENV_NAME}")")"
if [[ -f "${conda_base}/etc/profile.d/conda.sh" ]]; then
# shellcheck disable=SC1091
source "${conda_base}/etc/profile.d/conda.sh"
else
if need_cmd conda; then
# shellcheck disable=SC1091
source "$(conda info --base)/etc/profile.d/conda.sh"
else
log "ERROR: cannot find conda.sh and conda is not on PATH."
exit 1
fi
fi
# Avoid "unbound variable" in activation hooks under set -u
export JAVA_HOME="${JAVA_HOME:-}"
log "Activating env: ${ENV_NAME}"
set +u
conda activate "${ENV_NAME}"
set -u
}
check_dependencies() {
# ---- plot-only mode: needs Rscript and python (plot_tree writes labels.tsv with python) ----
if [[ "${MODE}" == "plot-only" ]]; then
local missing=()
command -v Rscript >/dev/null 2>&1 || missing+=("Rscript")
command -v python >/dev/null 2>&1 || missing+=("python")
if (( ${#missing[@]} )); then
log "ERROR: Missing required tools for plot-only in env: ${ENV_NAME}"
printf ' - %s\n' "${missing[@]}" >&2
exit 1
fi
# Check required R packages (fail early with clear message)
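# Check the R packages that plot_tree.R actually loads (fail early with clear message)
Rscript -e 'pkgs <- c("ape","ggplot2","ggtree","dplyr","readr");
miss <- pkgs[!sapply(pkgs, requireNamespace, quietly=TRUE)];
if(length(miss)) stop("Missing R packages: ", paste(miss, collapse=", "))'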
return 0
fi
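# ---- full pipeline: check every external tool the steps below call ----
local missing=()
local cmd
for cmd in python datasets unzip prokka roary raxml-ng Rscript; do
need_cmd "${cmd}" || missing+=("${cmd}")
done
if (( ${#missing[@]} )); then
log "ERROR: Missing required tools in env: ${ENV_NAME}"
printf ' - %s\n' "${missing[@]}" >&2
exit 1
fi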
}
prepare_accessions(){
[[ -s "${RESOLVED_TSV}" ]] || { log "ERROR: missing ${RESOLVED_TSV}"; exit 1; }
mkdir -p "${WORKDIR}/meta"
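# Write extras one per line; skip the expansion when the array is empty,
# which otherwise trips `set -u` on older bash versions.
: > "${WORKDIR}/meta/extras.txt"
if (( ${#EXTRA_ASSEMBLIES[@]} )); then
printf "%s\n" "${EXTRA_ASSEMBLIES[@]}" >> "${WORKDIR}/meta/extras.txt"
fi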
WORKDIR="${WORKDIR}" RESOLVED_TSV="${RESOLVED_TSV}" python - << 'PY'
import os
import pandas as pd
import pathlib
workdir = pathlib.Path(os.environ.get("WORKDIR", "work_wgs_tree"))
resolved_tsv = os.environ.get("RESOLVED_TSV", "resolved_accessions.tsv")
df = pd.read_csv(resolved_tsv, sep="\t")
# Expect columns like: label, best_accession (but be tolerant)
if "best_accession" not in df.columns:
    df = df.rename(columns={df.columns[1]: "best_accession"})
if "label" not in df.columns:
    df = df.rename(columns={df.columns[0]: "label"})
df = df[["label","best_accession"]].dropna()
df = df[df["best_accession"]!="NA"].copy()
extras_path = workdir/"meta/extras.txt"
extras = [x.strip() for x in extras_path.read_text().splitlines() if x.strip()]
extra_df = pd.DataFrame({"label":[f"EXTRA_{a}" for a in extras], "best_accession": extras})
all_df = pd.concat([df, extra_df], ignore_index=True)
all_df = all_df.drop_duplicates(subset=["best_accession"], keep="first").reset_index(drop=True)
out = workdir/"meta/accessions.tsv"
out.parent.mkdir(parents=True, exist_ok=True)
all_df.to_csv(out, sep="\t", index=False)
print("Final unique genomes:", len(all_df))
print(all_df)
print("Wrote:", out)
PY
}
download_genomes(){
mkdir -p "${WORKDIR}/genomes_ncbi"
while IFS=$'\t' read -r label acc; do
[[ "$label" == "label" ]] && continue
[[ -z "${acc}" ]] && continue
outdir="${WORKDIR}/genomes_ncbi/${acc}"
if [[ -d "${outdir}" ]]; then
log "Found ${acc}, skipping download"
continue
fi
log "Downloading ${acc}..."
datasets download genome accession "${acc}" --include genome --filename "${WORKDIR}/genomes_ncbi/${acc}.zip"
unzip -q "${WORKDIR}/genomes_ncbi/${acc}.zip" -d "${outdir}"
rm -f "${WORKDIR}/genomes_ncbi/${acc}.zip"
done < "${WORKDIR}/meta/accessions.tsv"
}
collect_fastas(){
mkdir -p "${WORKDIR}/fastas"
while IFS=$'\t' read -r label acc; do
[[ "$label" == "label" ]] && continue
[[ -z "${acc}" ]] && continue
fna="$(find "${WORKDIR}/genomes_ncbi/${acc}" -type f -name "*.fna" | head -n 1 || true)"
[[ -n "${fna}" ]] || { log "ERROR: .fna not found for ${acc}"; exit 1; }
cp -f "${fna}" "${WORKDIR}/fastas/${acc}.fna"
done < "${WORKDIR}/meta/accessions.tsv"
[[ -s "${REF_FASTA}" ]] || { log "ERROR: missing ${REF_FASTA}"; exit 1; }
cp -f "${REF_FASTA}" "${WORKDIR}/fastas/REF.fna"
}
run_prokka(){
mkdir -p "${WORKDIR}/prokka" "${WORKDIR}/gffs"
for fna in "${WORKDIR}/fastas/"*.fna; do
base="$(basename "${fna}" .fna)"
outdir="${WORKDIR}/prokka/${base}"
gffout="${WORKDIR}/gffs/${base}.gff"
if [[ -s "${gffout}" ]]; then
log "GFF exists for ${base}, skipping Prokka"
continue
fi
log "Prokka annotating ${base}..."
prokka --outdir "${outdir}" --prefix "${base}" --cpus "${THREADS}" --force "${fna}"
cp -f "${outdir}/${base}.gff" "${gffout}"
done
}
sanitize_and_check_gffs(){
log "Sanity checking GFFs (ignoring ##FASTA section)..."
for gff in "${WORKDIR}/gffs/"*.gff; do
if file "$gff" | grep -qi "CRLF"; then
log "Fixing CRLF -> LF in $(basename "$gff")"
sed -i 's/\r$//' "$gff"
fi
bad=$(awk '
BEGIN{bad=0; in_fasta=0}
/^##FASTA/{in_fasta=1; next}
in_fasta==1{next}
/^#/{next}
NF==0{next}
{
if (split($0,a,"\t")!=9) {bad=1}
}
END{print bad}
' "$gff")
if [[ "$bad" == "1" ]]; then
log "ERROR: GFF feature section not 9-column tab-delimited: $gff"
log "First 5 problematic feature lines (before ##FASTA):"
awk '
BEGIN{in_fasta=0; c=0}
/^##FASTA/{in_fasta=1; next}
in_fasta==1{next}
/^#/{next}
NF==0{next}
{
if (split($0,a,"\t")!=9) {
print
c++
if (c==5) exit
}
}
' "$gff" || true
exit 1
fi
done
}
run_roary(){
mkdir -p "${WORKDIR}/meta" "${WORKDIR}/logs"
ts="$(date +%s)"
run_id="${ts}_$$"
ROARY_OUT="${WORKDIR}/roary_${run_id}"
ROARY_STDOUT="${WORKDIR}/logs/roary_${run_id}.stdout.txt"
ROARY_STDERR="${WORKDIR}/logs/roary_${run_id}.stderr.txt"
MARKER="${WORKDIR}/meta/roary_${run_id}.start"
: > "${MARKER}"
log "Running Roary (outdir: ${ROARY_OUT})"
log "Roary logs:"
log " STDOUT: ${ROARY_STDOUT}"
log " STDERR: ${ROARY_STDERR}"
set +e
roary -e --mafft -p "${THREADS}" -cd 95 -i 95 \
-f "${ROARY_OUT}" "${WORKDIR}/gffs/"*.gff \
> "${ROARY_STDOUT}" 2> "${ROARY_STDERR}"
rc=$?
set -e
if [[ "${rc}" -ne 0 ]]; then
log "WARNING: Roary exited non-zero (rc=${rc}). Will check if core alignment was produced anyway."
fi
CORE_ALN="$(find "${WORKDIR}" -maxdepth 2 -type f -name "core_gene_alignment.aln" -newer "${MARKER}" -printf '%T@ %p\n' 2>/dev/null \
| sort -nr | head -n 1 | cut -d' ' -f2- || true)"
if [[ -z "${CORE_ALN}" || ! -s "${CORE_ALN}" ]]; then
log "ERROR: Could not find core_gene_alignment.aln produced by this Roary run under ${WORKDIR}"
log "---- STDERR (head) ----"
head -n 120 "${ROARY_STDERR}" 2>/dev/null || true
log "---- STDERR (tail) ----"
tail -n 120 "${ROARY_STDERR}" 2>/dev/null || true
exit 1
fi
CORE_DIR="$(dirname "${CORE_ALN}")"
cp -f "${ROARY_STDOUT}" "${CORE_DIR}/roary.stdout.txt" || true
cp -f "${ROARY_STDERR}" "${CORE_DIR}/roary.stderr.txt" || true
# >>> IMPORTANT FIX: store ABSOLUTE path <<<
CORE_ALN_ABS="$(readlink -f "${CORE_ALN}")"
log "Using core alignment: ${CORE_ALN_ABS}"
echo "${CORE_ALN_ABS}" > "${WORKDIR}/meta/core_alignment_path.txt"
echo "$(readlink -f "${CORE_DIR}")" > "${WORKDIR}/meta/roary_output_dir.txt"
}
run_raxmlng(){
mkdir -p "${WORKDIR}/raxmlng"
CORE_ALN="$(cat "${WORKDIR}/meta/core_alignment_path.txt")"
[[ -s "${CORE_ALN}" ]] || { log "ERROR: core alignment not found or empty: ${CORE_ALN}"; exit 1; }
log "Running RAxML-NG..."
raxml-ng --all \
--msa "${CORE_ALN}" \
--model GTR+G \
--bs-trees 1000 \
--threads "${THREADS}" \
--prefix "${WORKDIR}/raxmlng/core"
}
ensure_r_pkgs(){
Rscript - <<'RS'
need <- c("ape","ggplot2","dplyr","readr","aplot","ggtree")
missing <- need[!vapply(need, requireNamespace, logical(1), quietly=TRUE)]
if (length(missing)) {
message("Missing R packages: ", paste(missing, collapse=", "))
message("Try:")
message(" conda install -c conda-forge -c bioconda r-aplot bioconductor-ggtree r-ape r-ggplot2 r-dplyr r-readr")
quit(status=1)
}
RS
}
plot_tree(){
mkdir -p "${WORKDIR}/plot"
WORKDIR="${WORKDIR}" python - << 'PY'
import os
import pandas as pd
import pathlib
workdir = pathlib.Path(os.environ.get("WORKDIR", "work_wgs_tree"))
acc = pd.read_csv(workdir/"meta/accessions.tsv", sep="\t")
g = (acc.groupby("best_accession")["label"]
.apply(lambda x: "; ".join(sorted(set(map(str, x)))))
.reset_index())
g["display"] = g.apply(lambda r: f'{r["label"]} ({r["best_accession"]})', axis=1)
labels = g.rename(columns={"best_accession":"sample"})[["sample","display"]]
# Add REF
labels = pd.concat([labels, pd.DataFrame([{"sample":"REF","display":"REF"}])], ignore_index=True)
out = workdir/"plot/labels.tsv"
out.parent.mkdir(parents=True, exist_ok=True)
labels.to_csv(out, sep="\t", index=False)
print("Wrote:", out)
PY
cat > "${WORKDIR}/plot/plot_tree.R" << 'RS'
suppressPackageStartupMessages({
library(ape); library(ggplot2); library(ggtree); library(dplyr); library(readr)
})
args <- commandArgs(trailingOnly=TRUE)
tree_in <- args[1]; labels_tsv <- args[2]; k <- as.integer(args[3])
out_pdf <- args[4]; out_png <- args[5]
tr <- read.tree(tree_in)
lab <- read_tsv(labels_tsv, show_col_types=FALSE)
tipmap <- setNames(lab$display, lab$sample)
tr$tip.label <- ifelse(tr$tip.label %in% names(tipmap), tipmap[tr$tip.label], tr$tip.label)
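# ML trees are generally not ultrametric, so as.hclust.phylo() would stop with an error.
# Cluster on patristic (tip-to-tip) distances instead.
hc <- hclust(as.dist(cophenetic(tr)), method="average")
grp <- cutree(hc, k=k)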
grp_df <- tibble(tip=names(grp), clade=paste0("Clade_", grp))
p <- ggtree(tr, layout="rectangular") %<+% grp_df +
aes(color=clade) +
geom_tree(linewidth=0.9) +
geom_tippoint(aes(color=clade), size=2.3) +
geom_tiplab(aes(color=clade), size=3.1, align=TRUE,
linetype="dotted", linesize=0.35, offset=0.02) +
theme_tree2() +
theme(legend.position="right", legend.title=element_blank(),
plot.margin=margin(8,18,8,8))
# + geom_treescale(x=0, y=0, width=0.01, fontsize=3)
# ---- Manual scale bar (fixed label "0.01") ----
scale_x <- 0
scale_y <- 0
scale_w <- 0.01
p <- p +
annotate("segment",
x = scale_x, xend = scale_x + scale_w,
y = scale_y, yend = scale_y,
linewidth = 0.6) +
annotate("text",
x = scale_x + scale_w/2,
y = scale_y - 0.6,
label = "0.01",
size = 3)
# ----------------------------------------------
ggsave(out_pdf, p, width=9, height=6.5, device="pdf")
ggsave(out_png, p, width=9, height=6.5, dpi=300)
RS
Rscript "${WORKDIR}/plot/plot_tree.R" \
"${WORKDIR}/raxmlng/core.raxml.support" \
"${WORKDIR}/plot/labels.tsv" \
"${CLUSTERS_K}" \
"${WORKDIR}/plot/core_tree.pdf" \
"${WORKDIR}/plot/core_tree.png"
log "Plot written:"
log " ${WORKDIR}/plot/core_tree.pdf"
log " ${WORKDIR}/plot/core_tree.png"
}
main(){
mkdir -p "${WORKDIR}"
activate_existing_env
check_dependencies
if [[ "${MODE}" == "plot-only" ]]; then
log "Running plot-only mode"
plot_tree
log "DONE."
exit 0
fi
log "1) Prepare unique accessions"
prepare_accessions
log "2) Download genomes"
download_genomes
log "3) Collect FASTAs (+ REF)"
collect_fastas
log "4) Prokka"
run_prokka
log "4b) GFF sanity check"
sanitize_and_check_gffs
log "5) Roary"
run_roary
log "6) RAxML-NG"
run_raxmlng
#log "6b) Check R packages"
#ensure_r_pkgs
log "7) Plot"
plot_tree
log "DONE."
}
main "$@"
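The GFF sanity check in the script above (skip comments, blank lines and everything from `##FASTA` on, then require exactly nine tab-separated columns) can be expressed in a few lines of Python. This is a sketch of the same rule, not part of the pipeline: `bad_feature_lines` and the sample text are hypothetical.

```python
def bad_feature_lines(gff_text, limit=5):
    """Return up to `limit` feature lines that are not 9-column tab-delimited."""
    bad = []
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # the sequence block that follows is not feature data
        if line.startswith("#") or not line.strip():
            continue  # comments, directives and blank lines are fine
        if len(line.split("\t")) != 9:
            bad.append(line)
            if len(bad) == limit:
                break
    return bad


# Hypothetical GFF3 content: one valid feature, one space-delimited (broken) feature.
sample = "\n".join([
    "##gff-version 3",
    "contig_1\tProdigal\tCDS\t1\t300\t.\t+\t0\tID=gene_1",
    "contig_1 Prodigal CDS 301 600 . + 0 ID=gene_2",
    "##FASTA",
    ">contig_1",
    "ACGT",
])
problems = bad_feature_lines(sample)
print(problems)
```

Stopping at `##FASTA` matters because Prokka appends the contig sequences to its GFF output, and those lines would otherwise all be reported as malformed features.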
regenerate_labels.sh
python - <<'PY'
import json, re
from pathlib import Path
import pandas as pd
WORKDIR = Path("work_wgs_tree")
ACC_TSV = WORKDIR / "meta/accessions.tsv"
GENOMES_DIR = WORKDIR / "genomes_ncbi"
OUT = WORKDIR / "plot/labels.tsv"
def first_existing(paths):
    for p in paths:
        if p and Path(p).exists():
            return Path(p)
    return None


def find_metadata_files(acc_dir: Path):
    # NCBI Datasets layouts vary by version; search broadly
    candidates = []
    for pat in [
        "**/assembly_data_report.jsonl",
        "**/data_report.jsonl",
        "**/dataset_catalog.json",
        "**/*assembly_report*.txt",
        "**/*assembly_report*.tsv",
    ]:
        candidates += list(acc_dir.glob(pat))
    # de-dup, keep stable order
    seen = set()
    uniq = []
    for p in candidates:
        if p.as_posix() not in seen:
            uniq.append(p)
            seen.add(p.as_posix())
    return uniq


def parse_jsonl_for_name_and_strain(p: Path):
    # assembly_data_report.jsonl / data_report.jsonl: first JSON object usually has organism info
    try:
        with p.open() as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                obj = json.loads(line)
                # Try common fields
                # organismName may appear as:
                # obj["organism"]["organismName"] or obj["organismName"]
                org = None
                strain = None
                if isinstance(obj, dict):
                    if "organism" in obj and isinstance(obj["organism"], dict):
                        org = obj["organism"].get("organismName") or obj["organism"].get("taxName")
                        # isolate/strain can hide in infraspecificNames or isolate/strain keys
                        infra = obj["organism"].get("infraspecificNames") or {}
                        if isinstance(infra, dict):
                            strain = infra.get("strain") or infra.get("isolate")
                        strain = strain or obj["organism"].get("strain") or obj["organism"].get("isolate")
                    org = org or obj.get("organismName") or obj.get("taxName")
                    # Sometimes isolate/strain is nested elsewhere
                    if not strain:
                        # assemblyInfo / assembly / sampleInfo patterns
                        for key in ["assemblyInfo", "assembly", "sampleInfo", "biosample"]:
                            if key in obj and isinstance(obj[key], dict):
                                d = obj[key]
                                strain = strain or d.get("strain") or d.get("isolate")
                                infra = d.get("infraspecificNames")
                                if isinstance(infra, dict):
                                    strain = strain or infra.get("strain") or infra.get("isolate")
                    if org:
                        return org, strain
    except Exception:
        pass
    return None, None


def parse_dataset_catalog(p: Path):
    # dataset_catalog.json can include assembly/organism info, but structure varies.
    try:
        obj = json.loads(p.read_text())
    except Exception:
        return None, None
    org = None
    strain = None

    # walk dict recursively looking for likely keys
    def walk(x):
        nonlocal org, strain
        if isinstance(x, dict):
            # organism keys
            if not org:
                if "organismName" in x and isinstance(x["organismName"], str):
                    org = x["organismName"]
                elif "taxName" in x and isinstance(x["taxName"], str):
                    org = x["taxName"]
            # strain/isolate keys
            if not strain:
                for k in ["strain", "isolate"]:
                    if k in x and isinstance(x[k], str) and x[k].strip():
                        strain = x[k].strip()
                        break
            for v in x.values():
                walk(v)
        elif isinstance(x, list):
            for v in x:
                walk(v)

    walk(obj)
    return org, strain


def parse_assembly_report_txt(p: Path):
    # NCBI assembly_report.txt often has lines like: "# Organism name:" and "# Infraspecific name:"
    org = None
    strain = None
    try:
        for line in p.read_text(errors="ignore").splitlines():
            if line.startswith("# Organism name:"):
                org = line.split(":", 1)[1].strip()
            elif line.startswith("# Infraspecific name:"):
                val = line.split(":", 1)[1].strip()
                # e.g. "strain=XXXX" or "isolate=YYYY"
                m = re.search(r"(strain|isolate)\s*=\s*(.+)", val)
                if m:
                    strain = m.group(2).strip()
            if org and strain:
                break
    except Exception:
        pass
    return org, strain


def best_name_for_accession(acc: str):
    acc_dir = GENOMES_DIR / acc
    if not acc_dir.exists():
        return None
    files = find_metadata_files(acc_dir)
    org = None
    strain = None
    # Prefer JSONL reports first
    for p in files:
        if p.name.endswith(".jsonl"):
            org, strain = parse_jsonl_for_name_and_strain(p)
            if org:
                break
    # Next try dataset_catalog.json
    if not org:
        for p in files:
            if p.name == "dataset_catalog.json":
                org, strain = parse_dataset_catalog(p)
                if org:
                    break
    # Finally try assembly report text
    if not org:
        for p in files:
            if "assembly_report" in p.name and p.suffix in [".txt", ".tsv"]:
                org, strain = parse_assembly_report_txt(p)
                if org:
                    break
    if not org:
        return None
    # normalize whitespace
    org = re.sub(r"\s+", " ", org).strip()
    if strain:
        strain = re.sub(r"\s+", " ", str(strain)).strip()
    # avoid duplicating if strain already in organism string
    if strain and strain.lower() not in org.lower():
        return f"{org} {strain}"
    return org
# --- build labels ---
acc = pd.read_csv(ACC_TSV, sep="\t")
if "label" not in acc.columns or "best_accession" not in acc.columns:
    raise SystemExit("accessions.tsv must have columns: label, best_accession")
rows = []
for _, r in acc.iterrows():
    label = str(r["label"])
    accn = str(r["best_accession"])
    if label.startswith("EXTRA_"):
        nm = best_name_for_accession(accn)
        if nm:
            label = nm
        else:
            # fallback: keep previous behavior if metadata not found
            label = label.replace("EXTRA_", "EXTRA ")
    display = f"{label} ({accn})"
    rows.append({"sample": accn, "display": display})
# Add GE11174 exactly as-is
rows.append({"sample": "GE11174", "display": "GE11174"})
out_df = pd.DataFrame(rows).drop_duplicates(subset=["sample"], keep="first")
OUT.parent.mkdir(parents=True, exist_ok=True)
out_df.to_csv(OUT, sep="\t", index=False)
print("Wrote:", OUT)
print(out_df)
PY
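The last fallback in the label script reads the `# Organism name:` and `# Infraspecific name:` header lines of an assembly report and then joins organism and strain without repeating the strain. A minimal, self-contained sketch of that step follows. The function names and the report text are hypothetical (the organism is made up), and only the two header prefixes are taken from the script.

```python
import re


def parse_report_header(text):
    """Extract (organism, strain) from assembly-report style header lines."""
    org = None
    strain = None
    for line in text.splitlines():
        if line.startswith("# Organism name:"):
            org = line.split(":", 1)[1].strip()
        elif line.startswith("# Infraspecific name:"):
            value = line.split(":", 1)[1].strip()
            match = re.search(r"(strain|isolate)\s*=\s*(.+)", value)
            if match:
                strain = match.group(2).strip()
        if org and strain:
            break
    return org, strain


def display_name(org, strain=None):
    """Join organism and strain, skipping the strain if the organism already contains it."""
    org = re.sub(r"\s+", " ", org).strip()
    if strain:
        strain = re.sub(r"\s+", " ", str(strain)).strip()
    if strain and strain.lower() not in org.lower():
        return f"{org} {strain}"
    return org


# Hypothetical header, for illustration only.
report = "\n".join([
    "# Assembly name:  ASM0001v1",
    "# Organism name:  Examplea demonstrans",
    "# Infraspecific name:  strain=X1",
])
org, strain = parse_report_header(report)
print(display_name(org, strain))
```

The case-insensitive containment test keeps a name such as "Examplea demonstrans X1" from becoming "Examplea demonstrans X1 X1" when the organism field already carries the strain.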
plot_tree_v4.R
suppressPackageStartupMessages({
library(ape)
library(readr)
})
args <- commandArgs(trailingOnly = TRUE)
tree_in <- args[1]
labels_tsv <- args[2]
# args[3] is k (ignored here since all-black)
out_pdf <- args[4]
out_png <- args[5]
# --- Load tree ---
tr <- read.tree(tree_in)
# --- Root on outgroup (Brenneria nigrifluens) by accession ---
outgroup_id <- "GCF_005484965.1"
if (outgroup_id %in% tr$tip.label) {
tr <- root(tr, outgroup = outgroup_id, resolve.root = TRUE)
} else {
warning("Outgroup tip not found in tree: ", outgroup_id, " (tree will remain unrooted)")
}
# Make plotting order nicer
tr <- ladderize(tr, right = FALSE)
# --- Load labels (columns: sample, display) ---
lab <- read_tsv(labels_tsv, show_col_types = FALSE)
if (!all(c("sample","display") %in% colnames(lab))) {
stop("labels.tsv must contain columns: sample, display")
}
# Map tip labels AFTER rooting (rooting uses accession IDs)
tipmap <- setNames(lab$display, lab$sample)
tr$tip.label <- ifelse(tr$tip.label %in% names(tipmap),
unname(tipmap[tr$tip.label]),
tr$tip.label)
# --- Plot helper ---
plot_one <- function(device_fun) {
device_fun()
op <- par(no.readonly = TRUE)
on.exit(par(op), add = TRUE)
# Bigger right margin for long labels; tighter overall
par(mar = c(4, 2, 2, 18), xpd = NA)
# Compute xlim with padding so labels fit but whitespace is limited
xx <- node.depth.edgelength(tr)
xmax <- max(xx)
xpad <- 0.10 * xmax
plot(tr,
type = "phylogram",
use.edge.length = TRUE,
show.tip.label = TRUE,
edge.color = "black",
tip.color = "black",
cex = 0.9, # smaller text -> less overlap
label.offset = 0.003, # small gap after tip
no.margin = FALSE,
x.lim = c(0, xmax + xpad))
# Add a clear scale bar near bottom-left with a fixed length of 0.01 (branch-length units)
add.scale.bar(x = 0, y = 0, length = 0.01, lwd = 2, cex = 0.9)
}
# --- Write outputs (shorter height -> less vertical whitespace) ---
plot_one(function() pdf(out_pdf, width = 11, height = 6, useDingbats = FALSE))
dev.off()
plot_one(function() png(out_png, width = 3000, height = 1000, res = 300))
dev.off()
cat("Wrote:\n", out_pdf, "\n", out_png, "\n", sep = "")
run_fastani_batch_verbose.sh
#!/usr/bin/env bash
set -euo pipefail
# ============ CONFIG ============
QUERY="bacass_out/Prokka/An6/An6.fna" # your query FASTA
ACC_LIST="accessions.txt" # one GCF/GCA accession per line
OUTDIR="fastani_batch"
THREADS=8
SUFFIX=".genomic.fna"
# =================================
ts() { date +"%F %T"; }
log() { echo "[$(ts)] $*"; }
die() { echo "[$(ts)] ERROR: $*" >&2; exit 1; }
# --- checks ---
log "Checking required commands..."
for cmd in fastANI awk sort unzip gunzip find grep wc head readlink; do
command -v "$cmd" >/dev/null 2>&1 || die "Missing command: $cmd"
done
command -v datasets >/dev/null 2>&1 || die "Missing NCBI datasets CLI. Install from NCBI Datasets."
[[ -f "$QUERY" ]] || die "QUERY not found: $QUERY"
[[ -f "$ACC_LIST" ]] || die "Accession list not found: $ACC_LIST"
log "QUERY: $QUERY"
log "ACC_LIST: $ACC_LIST"
log "OUTDIR: $OUTDIR"
log "THREADS: $THREADS"
mkdir -p "$OUTDIR/ref_fasta" "$OUTDIR/zips" "$OUTDIR/tmp" "$OUTDIR/logs"
REF_LIST="$OUTDIR/ref_list.txt"
QUERY_LIST="$OUTDIR/query_list.txt"
RAW_OUT="$OUTDIR/fastani_raw.tsv"
FINAL_OUT="$OUTDIR/fastani_results.tsv"
DL_LOG="$OUTDIR/logs/download.log"
ANI_LOG="$OUTDIR/logs/fastani.log"
: > "$REF_LIST"
: > "$DL_LOG"
: > "$ANI_LOG"
# --- build query list ---
q_abs="$(readlink -f "$QUERY")"
echo "$q_abs" > "$QUERY_LIST"
log "Wrote query list: $QUERY_LIST"
log " -> $q_abs"
# --- download refs ---
log "Downloading reference genomes via NCBI datasets..."
n_ok=0
n_skip=0
while read -r acc; do
[[ -z "$acc" ]] && continue
[[ "$acc" =~ ^# ]] && continue
log "Ref: $acc"
zip="$OUTDIR/zips/${acc}.zip"
unpack="$OUTDIR/tmp/$acc"
out_fna="$OUTDIR/ref_fasta/${acc}${SUFFIX}"
# download zip
log " - datasets download -> $zip"
if datasets download genome accession "$acc" --include genome --filename "$zip" >>"$DL_LOG" 2>&1; then
log " - download OK"
else
log " - download FAILED (see $DL_LOG), skipping $acc"
n_skip=$((n_skip+1))
continue
fi
# unzip
rm -rf "$unpack"
mkdir -p "$unpack"
log " - unzip -> $unpack"
if unzip -q "$zip" -d "$unpack" >>"$DL_LOG" 2>&1; then
log " - unzip OK"
else
log " - unzip FAILED (see $DL_LOG), skipping $acc"
n_skip=$((n_skip+1))
continue
fi
# find genomic.fna (package layouts differ: prefer *genomic.fna, then fall back to any .fna)
fna="$(find "$unpack" -type f \( -name "*genomic.fna" -o -name "*genomic.fna.gz" \) | head -n 1 || true)"
if [[ -z "${fna:-}" ]]; then
log " - genomic.fna not found, try any *.fna"
fna="$(find "$unpack" -type f -name "*.fna" | head -n 1 || true)"
fi
if [[ -z "${fna:-}" ]]; then
log " - FAILED to find any .fna in package (see $DL_LOG). skipping $acc"
n_skip=$((n_skip+1))
continue
fi
# handle gz if needed
if [[ "$fna" == *.gz ]]; then
log " - found gzipped fasta: $(basename "$fna"), gunzip -> $out_fna"
gunzip -c "$fna" > "$out_fna"
else
log " - found fasta: $(basename "$fna"), copy -> $out_fna"
cp -f "$fna" "$out_fna"
fi
# sanity check fasta looks non-empty
if [[ ! -s "$out_fna" ]]; then
log " - output fasta is empty, skipping $acc"
n_skip=$((n_skip+1))
continue
fi
echo "$(readlink -f "$out_fna")" >> "$REF_LIST"
n_ok=$((n_ok+1))
log " - saved ref fasta OK"
done < "$ACC_LIST"
log "Download summary: OK=$n_ok, skipped=$n_skip"
log "Ref list written: $REF_LIST ($(wc -l < "$REF_LIST") refs)"
if [[ "$(wc -l < "$REF_LIST")" -eq 0 ]]; then
die "No references available. Check $DL_LOG"
fi
# --- run fastANI ---
log "Running fastANI..."
log "Command:"
log " fastANI -ql $QUERY_LIST -rl $REF_LIST -t $THREADS -o $RAW_OUT"
# Important: do not swallow error messages; send stdout/stderr to the log
if fastANI -ql "$QUERY_LIST" -rl "$REF_LIST" -t "$THREADS" -o "$RAW_OUT" >>"$ANI_LOG" 2>&1; then
log "fastANI finished (see $ANI_LOG)"
else
log "fastANI FAILED (see $ANI_LOG)"
die "fastANI failed. Inspect $ANI_LOG"
fi
# --- verify raw output ---
if [[ ! -f "$RAW_OUT" ]]; then
die "fastANI did not create $RAW_OUT. Check $ANI_LOG"
fi
if [[ ! -s "$RAW_OUT" ]]; then
die "fastANI output is empty ($RAW_OUT). Check $ANI_LOG; also verify fasta validity."
fi
log "fastANI raw output: $RAW_OUT ($(wc -l < "$RAW_OUT") lines)"
log "Sample lines:"
head -n 5 "$RAW_OUT" || true
# --- create final table ---
log "Creating final TSV with header..."
echo -e "Query\tReference\tANI\tMatchedFrag\tTotalFrag" > "$FINAL_OUT"
awk 'BEGIN{OFS="\t"} {print $1,$2,$3,$4,$5}' "$RAW_OUT" >> "$FINAL_OUT"
log "Final results: $FINAL_OUT ($(wc -l < "$FINAL_OUT") lines incl. header)"
log "Top hits (ANI desc):"
tail -n +2 "$FINAL_OUT" | sort -k3,3nr | head -n 10 || true
log "DONE."
log "Logs:"
log " download log: $DL_LOG"
log " fastANI log: $ANI_LOG" 
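For downstream use, the raw fastANI table (query, reference, ANI, matched fragments, total fragments, tab-separated and without a header) can also be parsed and ranked in Python instead of `awk` and `sort`. The sketch below assumes that five-column layout. `top_hits` is a hypothetical helper, and the paths and values in the sample are made up.

```python
import csv
import io


def top_hits(raw_tsv, n=10):
    """Parse headerless fastANI output and return the n rows with the highest ANI."""
    rows = []
    for rec in csv.reader(io.StringIO(raw_tsv), delimiter="\t"):
        if len(rec) < 5:
            continue  # skip blank or truncated lines
        query, ref, ani, matched, total = rec[:5]
        rows.append({
            "Query": query,
            "Reference": ref,
            "ANI": float(ani),
            "MatchedFrag": int(matched),
            "TotalFrag": int(total),
        })
    rows.sort(key=lambda r: r["ANI"], reverse=True)
    return rows[:n]


# Hypothetical fastANI output, for illustration only.
raw = (
    "query.fna\tref_A.genomic.fna\t88.12\t950\t1500\n"
    "query.fna\tref_B.genomic.fna\t97.45\t1400\t1500\n"
    "query.fna\tref_C.genomic.fna\t91.30\t1200\t1500\n"
)
ranked = top_hits(raw)
for row in ranked:
    print(row["Reference"], row["ANI"], f'{row["MatchedFrag"]}/{row["TotalFrag"]}')
```

Keeping the fragment counts next to the ANI value is useful when reading the table, because a high ANI supported by few matched fragments covers only a small part of the query genome.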




