Using PacBio and Illumina in MAKER

Timo Metz
Hey guys,

I was wondering what the best way would be to use PacBio long reads and assembled Illumina short reads in MAKER. PacBio reads give higher confidence in finding correct gene models because they do not need to be assembled, but I do not have enough of them to build an annotation from PacBio reads alone. I also have tons of short reads, but they are really short (30-40 bp), so they are not very reliable. Is it a good idea to first do an annotation with only PacBio reads and protein data, and then do a "re-annotation" with the Illumina reads to identify only the "new" models they introduce while leaving the old models intact? Are there any other suggestions or experiences?

best
Timo

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: Using PacBio and Illumina in MAKER

Daniel Ence-2
Hi Timo, first of all, are these RNA-seq or DNA-seq reads? If they are DNA, then the best use would be to improve your reference assembly as much as possible. If they are RNA-seq, then you want to do whatever kind of assembly you can (Trinity, for example, for Illumina) with the Illumina reads and the PacBio reads separately.
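
If the reads are RNA-seq, the separately assembled transcript sets can then be handed to MAKER as transcript evidence. A minimal sketch of that step, assuming the `est=` option in `maker_opts.ctl` is used and that it accepts a comma-separated list of FASTA files; the file names below are hypothetical placeholders, not files from this thread:

```python
def maker_est_line(assemblies):
    """Build an est= line for maker_opts.ctl from transcript FASTA paths.

    Assumption: each path is a separately assembled transcript set
    (for example one from Illumina reads, one from PacBio reads).
    """
    paths = [p.strip() for p in assemblies if p.strip()]
    if not paths:
        raise ValueError("at least one transcript FASTA is required")
    # MAKER control-file options take comma-separated file lists.
    return "est=" + ",".join(paths)


# Hypothetical file names, for illustration only.
est_line = maker_est_line(["trinity_illumina.fasta", "pacbio_transcripts.fasta"])
print(est_line)
```

The printed line would replace the empty `est=` entry in the control file; the rest of the file is left as generated by MAKER.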

You can also use EVidenceModeler, which is compatible with more recent versions of MAKER, to assign weights to different datasets, so you can reflect how much confidence you have in each one.
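
To make the weighting idea concrete, here is a rough sketch that writes a weights table in the layout used by standalone EVidenceModeler: three tab-separated columns (evidence class, source, weight). The source labels and the weight values are made-up assumptions, and how MAKER itself passes weights to EVidenceModeler may differ by version, so check the documentation for your release:

```python
# Evidence classes recognized in a standalone EVidenceModeler weights file.
EVIDENCE_CLASSES = {"ABINITIO_PREDICTION", "PROTEIN", "TRANSCRIPT", "OTHER_PREDICTION"}


def format_weights(rows):
    """Return weights-file text: class<TAB>source<TAB>weight, one row per line."""
    lines = []
    for evidence_class, source, weight in rows:
        if evidence_class not in EVIDENCE_CLASSES:
            raise ValueError(f"unknown evidence class: {evidence_class}")
        if weight < 0:
            raise ValueError("weights must be non-negative")
        lines.append(f"{evidence_class}\t{source}\t{weight}")
    return "\n".join(lines) + "\n"


# Hypothetical sources and weights: long-read transcripts are trusted
# more than transcripts assembled from very short reads.
weights_text = format_weights([
    ("TRANSCRIPT", "pacbio_transcripts", 10),
    ("TRANSCRIPT", "trinity_illumina", 4),
    ("PROTEIN", "protein_alignments", 5),
])
print(weights_text, end="")
```

The source column has to match the source field (column 2) of the corresponding GFF3 evidence, so the labels above would need to be replaced with whatever your alignment files actually contain.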

~Daniel

> On Apr 18, 2018, at 8:35 AM, Timo Metz <[hidden email]> wrote:

Re: Using PacBio and Illumina in MAKER

Carson Holt-2
I would add that if you have a PacBio assembly, you can use programs like Pilon to polish it (Pilon uses the high-accuracy Illumina reads to correct errors in the long-read assembly). I would at least do that rather than using the PacBio assembly directly.
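
A minimal sketch of what a Pilon polishing call could look like, assuming the Illumina reads have already been aligned to the PacBio assembly and the BAM is sorted and indexed. The paths, output prefix, and heap size are hypothetical; `--frags` is meant for paired-end libraries, so unpaired reads would go through `--unpaired` instead. The helper only builds the command so it can be inspected before running:

```python
import shlex


def build_pilon_command(pilon_jar, assembly_fasta, illumina_bam,
                        out_prefix, out_dir, java_heap="16G"):
    """Return the argument list for a basic Pilon polishing run."""
    return [
        "java", f"-Xmx{java_heap}", "-jar", pilon_jar,
        "--genome", assembly_fasta,   # assembly to be corrected
        "--frags", illumina_bam,      # sorted, indexed paired-end alignments
        "--output", out_prefix,       # prefix for the polished FASTA
        "--outdir", out_dir,
        "--changes",                  # also write a list of applied edits
    ]


# Hypothetical paths, for illustration only.
cmd = build_pilon_command("pilon.jar", "pacbio_assembly.fasta",
                          "illumina_vs_pacbio.sorted.bam",
                          "pacbio_polished", "pilon_out")
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)`.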

—Carson

> On Apr 18, 2018, at 9:25 AM, Daniel Ence <[hidden email]> wrote:

Re: Using PacBio and Illumina in MAKER

Carson Holt-2
This is relevant when using Pilon on mRNA-seq assemblies (you need to modify some command-line options): https://github.com/broadinstitute/pilon/issues/50

—Carson

On Apr 18, 2018, at 9:29 AM, Carson Holt <[hidden email]> wrote:
