RNA-seq on sage

  1. TODO on sage

    The reads align very poorly to the annotation that was sent from Munich, so use the reference X14112 instead and find the CMV-GFP in the genome. Run an alignment to determine the overall alignment rate against both X14112 and chrHsv1_s17.
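
One way to read off an alignment rate per reference is to pull the mapped-read percentage from the aligner log. The sketch below assumes STAR was the aligner and that its per-sample `Log.final.out` files are available. The function name `star_unique_rate`, the `results_X14112` directory and the sample log path are hypothetical. It reports only the uniquely mapped percentage, not multi-mappers.

```shell
# Print the "Uniquely mapped reads %" value (without the % sign)
# from a STAR Log.final.out file given as the first argument.
star_unique_rate() {
    awk -F'|' '/Uniquely mapped reads %/ {
        gsub(/[ \t%]/, "", $2)   # strip padding and the trailing percent sign
        print $2
    }' "$1"
}

# Hypothetical usage: compare one sample across the two references.
# for ref in results_X14112 results_chrHsv1; do
#     printf '%s\t%s\n' "$ref" "$(star_unique_rate "$ref/star_salmon/log/sample1.Log.final.out")"
# done
```

Comparing the same sample across both result directories gives a quick indication of which reference fits the reads better.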

  2. commands on sage

    #under sage
    ln -s /home/jhuang/Tools/nf-core-rnaseq-3.12.0/ rnaseq
    [jhuang@sage Data_Caroline_RNAseq_wt_timecourse] nextflow run rnaseq/main.nf --input samplesheet_wt_timecourse.csv --outdir results_GRCh38 --genome GRCh38 -profile test_full  -resume   --max_memory 256.GB --max_time 2400.h        --aligner 'star_salmon' --skip_multiqc
    
    [jhuang@sage Data_Caroline_RNAseq_wt_timecourse] nextflow run rnaseq/main.nf --input samplesheet_wt_timecourse.csv --outdir results_chrHsv1  --fasta chrHsv1_s17.fasta --gtf chrHsv1_s17.gtf  -profile test_full -resume  --max_memory 256.GB --max_time 2400.h     --save_reference    --aligner 'star_salmon'    --gtf_extra_attributes 'gene_id' --gtf_group_features 'transcript_id' --featurecounts_group_type 'gene_id' --featurecounts_feature_type 'transcript'  --skip_rseqc --skip_dupradar --skip_preseq --skip_biotype_qc --skip_deseq2_qc --skip_multiqc
    
    [jhuang@sage Data_Caroline_RNAseq_brain_organoids] nextflow run rnaseq/main.nf --input samplesheet_brain_organoids.12.csv --outdir results_GRCh38 --genome GRCh38 -profile test_full  -resume   --max_memory 256.GB --max_time 2400.h        --aligner 'star_salmon' --skip_multiqc
    
    [jhuang@sage Data_Caroline_RNAseq_brain_organoids] nextflow run rnaseq/main.nf --input samplesheet_brain_organoids.12.csv --outdir results_chrHsv1  --fasta chrHsv1_s17.fasta --gtf chrHsv1_s17.gtf  -profile test_full -resume  --max_memory 256.GB --max_time 2400.h     --save_reference    --aligner 'star_salmon'    --gtf_extra_attributes 'gene_id' --gtf_group_features 'transcript_id' --featurecounts_group_type 'gene_id' --featurecounts_feature_type 'transcript'  --skip_rseqc --skip_dupradar --skip_preseq --skip_biotype_qc --skip_deseq2_qc --skip_multiqc
    
    #Processing *.umi_extract.fastq.gz
    (rnaseq) [jhuang@sage Data_Manja_RNAseq_Organoids_Virus]$ nextflow run rnaseq/main.nf --input samplesheet.umi_extract.csv --outdir results_chrHsv1  --fasta chrHsv1_s17.fasta --gtf chrHsv1_s17.gtf  -profile test_full -resume  --max_memory 256.GB --max_time 2400.h     --save_reference    --aligner 'star_salmon'    --gtf_extra_attributes 'gene_id' --gtf_group_features 'transcript_id' --featurecounts_group_type 'gene_id' --featurecounts_feature_type 'transcript'  --skip_rseqc --skip_dupradar --skip_preseq --skip_biotype_qc --skip_deseq2_qc --skip_multiqc
    
    #Processing raw data prepared with umi protocol
    (rnaseq) [jhuang@sage Data_Manja_RNAseq_Organoids_Virus]$ nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results_chrHsv1 --fasta chrHsv1_s17.fasta --gtf chrHsv1_s17.gtf --with_umi --umitools_extract_method "regex" --umitools_bc_pattern "^(?P .{12}).*" --umitools_dedup_stats -profile test_full -resume --max_memory 256.GB --max_time 2400.h --save_reference --aligner 'star_salmon' --gtf_extra_attributes 'gene_id' --gtf_group_features 'transcript_id' --featurecounts_group_type 'gene_id' --featurecounts_feature_type 'transcript' --skip_rseqc --skip_dupradar --skip_preseq --skip_biotype_qc --skip_deseq2_qc --skip_multiqc --min_mapped_reads 0
    
    #Debug the following error: added "--minAssignedFrags 0 \\" to modules/nf-core/salmon/quant/main.nf option "salmon quant" and added "--min_mapped_reads 0" in the nextflow command above
    #hits: 0; hits per frag: 0
    [2023-10-20 11:35:22.944] [jointLog] [warning] salmon was only able to assign 0 fragments to transcripts in the index, but the minimum number of required assigned fragments (--minAssignedFrags) was 1. This could be indicative of a mismatch between the reference and sample, or a very bad sample. You can change the --minAssignedFrags parameter to force salmon to quantify with fewer assigned fragments (must have at least 1).
    
    (rnaseq) [jhuang@sage Data_Denise_LT_RNAseq]$ nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results_GRCh38 --genome GRCh38 -profile test_full -resume --max_memory 256.GB --max_time 2400.h --save_align_intermeds --save_unaligned --aligner 'star_salmon' --skip_multiqc
    
    (rnaseq) [jhuang@sage Data_Samira_RNAseq]$ nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results_GRCh38 --genome GRCh38 -profile test_full -resume --max_memory 256.GB --max_time 2400.h --save_align_intermeds --save_unaligned --aligner 'star_salmon'
    
    (rnaseq) [jhuang@sage Data_Manja_RNAseq_Organoids]$ nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results_GRCh38 --genome GRCh38 --with_umi --umitools_extract_method "regex" --umitools_bc_pattern "^(?P .{12}).*" -profile test_full -resume --max_memory 256.GB --max_time 2400.h --save_align_intermeds --save_unaligned --save_reference --aligner 'star_salmon' --pseudo_aligner 'salmon'
    
    (rnaseq) [jhuang@sage Data_Manja_RNAseq_Organoids_Virus]$ nextflow run rnaseq/main.nf --input samplesheet.csv --outdir results_chrHsv1_s17 --fasta "/home/jhuang/DATA/Data_Manja_RNAseq_Organoids_Virus/chrHsv1_s17.fasta" --gtf "/home/jhuang/DATA/Data_Manja_RNAseq_Organoids_Virus/chrHsv1_s17.gtf" --with_umi --umitools_extract_method "regex" --umitools_bc_pattern "^(?P .{12}).*" --umitools_dedup_stats --skip_rseqc --skip_dupradar --skip_preseq -profile test_full -resume --max_memory 256.GB --max_time 2400.h --save_align_intermeds --save_unaligned --save_reference --aligner 'star_salmon' --gtf_extra_attributes 'gene_id' --gtf_group_features 'transcript_id' --featurecounts_group_type 'gene_id' --featurecounts_feature_type 'transcript' --skip_multiqc
    
    ln -s ~/Tools/rnaseq/assets/multiqc_config.yaml multiqc_config.yaml
    multiqc -f --config multiqc_config.yaml . 2>&1
    rm multiqc_config.yaml

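
The regex passed to `--umitools_bc_pattern` in the UMI commands takes the first 12 bases of each read as the UMI and leaves the rest as the read sequence. In UMI-tools regex mode, the named group that captures the UMI is expected to be called `umi_<n>` (for example `umi_1`). The group name is not legible in the commands here, so treat `umi_1` as an assumption. A rough way to preview the split on a single read, using plain `sed` rather than UMI-tools itself, is sketched below; the function names `umi_of` and `read_without_umi` and the example read are made up for illustration.

```shell
# First 12 bases of a read: the part the pattern captures as the UMI.
umi_of() {
    printf '%s\n' "$1" | sed -E 's/^(.{12}).*$/\1/'
}

# Remainder of the read after the 12-base UMI has been removed.
read_without_umi() {
    printf '%s\n' "$1" | sed -E 's/^.{12}(.*)$/\1/'
}

read_seq="ACGTACGTACGTTTGGCCAA"   # made-up 20-base read
echo "UMI:  $(umi_of "$read_seq")"
echo "read: $(read_without_umi "$read_seq")"
```

Reads shorter than 12 bases do not match the expression and are printed unchanged by both functions, so this only previews the pattern on well-formed reads.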
  3. reference on sage

    /home/jhuang/REFs/Homo_sapiens/Ensembl/GRCh38
    /home/jhuang/REFs/Homo_sapiens/hg38-blacklist.bed
    