Conversation
| cram_reference: cram_reference | ||
| out: [bam_file] | ||
| revert_bam: | ||
| run: ../tools/revert_input.cwl |
There was a problem hiding this comment.
Could revert_input.cwl be run on CRAMs directly? If so seems like input_to_bam could be removed. If not, seems like it should be named revert_bam.cwl instead of revert_input.cwl to denote its specificity.
(Is this operation time and space-efficient enough that we're willing to do it on all inputs instead of only those that need it?)
There was a problem hiding this comment.
The problem is the specificity, the tool in revert_input.cwl can be run on anything but you would have to use the input variable, currently bam, to grab the appropriate out file , basically would require some javascript i expect which i try to avoid in cwl. I think renaming to revert_bam.cwl is preferable from my perspective.
The flexibility in the PR does come at a computational cost for sure. I think we should run revert_input.cwl either way though so the cost is just converting whatever input to bam
|
Does @tmooney's recent code - that allows for Fastq or Bam input to the alignment workflow - provide a path forward here? #741 Seems like could be possible to expand the sequence type to have 4 input types instead of 2: Then we would add some downstream logic in the alignment steps to run RevertSam if needed. That would apply in (I think) just three different places: dna alignment, rnaseq alignment, bisulfite alignment |
This should allow the rnaseq.cwl workflow be input agnostic in terms of input being sam/cram/bam. I've also incorporated the suggestion from @jasonwalker80 on using
RevertSamwhich was not in here before.The best solution I could come up with was to force conversion to bam format right at the start. If you were instead to try and pass SAM/CRAM/BAM to
RevertSamthe output file and more importantly the extension would be incorrect (though if someone can figure this out maybe that's a better solution?).At any rate as it is now samtools is run and just converts to bam so while there are 3 input types only 1 output type averting this headache alluded to above.
Side note mgibio has 2 samtools docker images, updated at the same time. I don't know which one we use so if this is accepted those lines will have to be modified (right now it uses zlskidmore/samtools)