diff --git a/README.md b/README.md
index f95d47bedbb9715ff6f993d3ff94301bff8a6bac..4fa5b2a93c7dffbc2d4bac013f059c2fc15fe1a4 100644
--- a/README.md
+++ b/README.md
@@ -112,6 +112,12 @@ Other parameters are available:
   Other options:
     --outdir The output directory where the results will be saved.
     --help Show the help message and exit.
+
+  Skip options:
+
+    --skip_sickle Skip the sickle process.
+    --skip_kaiju_index Skip building the kaiju database (index_db_kaiju process).
+
 ```
 
 ## Generated files
@@ -121,8 +127,27 @@ The pipeline will create the following files in your working directory:
 ```
 * work # Directory containing the nextflow working files
 * results # Directory containing result files
+ ** results/01_Cleaned_raw_data: cleaned raw data files (after cutadapt or cutadapt+sickle, and after removal of human reads)
+ ** results/02_Quality_control: MultiQC file
+ ** results/03_Classification_Kaiju: index database files (if the index_db_kaiju process is not skipped) and kaiju files (kaiju result files, Krona charts, histograms, kaiju results for each node of the taxonomy tree)
+ ** results/04_Assembly: assembly files and assembly metrics
+ ** results/05_Annotation: .gff, .ffn, .fna, .faa, etc. files from prokka annotation, and .gff, .ffn, .fna, .faa files with renamed contigs and genes
+ ** results/06_Clustering: cd-hit results for each sample, correspondence table of intermediate clusters and genes, cd-hit results at the global level, and correspondence table of global clusters and intermediate clusters (table_clstr.txt)
+ ** results/07_Quantification: .bam and .bam.bai files after read alignment on contigs, .count files (featureCounts counts), .summary file (featureCounts summary), .output file (featureCounts output), Correspondence_global_clstr_contigs.txt (correspondence table between global clusters and genes), Clusters_Count_table_all_samples.txt (quantification table of aligned reads for each global cluster and each sample).
+
 * .nextflow_log # Log file from Nextflow
 * # Other nextflow hidden files, eg. history of pipeline runs and old logs.
+```
+# How to run the demonstration on the genologin cluster
+
+* Test data are available [here](https://forgemia.inra.fr/genotoul-bioinfo/metagwgs/tree/master/test)
+* The BWA index of the human reference genome is available at /bank/bwadb/ensembl_homo_sapiens_genome
+* The kaiju database index files are available at /bank/kaijudb/kaijudb_Juin2019/
+
+You can run the pipeline as follows:
+```bash
+nextflow run -profile cluster_slurm main.nf --reads '*_{R1,R2}.fastq.gz' --assembly metaspades --skip_kaiju_index --kaiju_nodes /bank/kaijudb/kaijudb_Juin2019/nodes.dmp --kaiju_db /bank/kaijudb/kaijudb_Juin2019/refseq/kaiju_db_refseq.fmi --kaiju_names /bank/kaijudb/kaijudb_Juin2019/names.dmp
+
 ```
 
 # License
@@ -135,5 +160,5 @@ metagWGS is distributed under the GNU General Public License v3.
 
 # Citation
 
-metagWGS has not been published yet.
+metagWGS will be presented at JOBIM 2019 ("Whole metagenome analysis with metagWGS").