protein2ipr data source display

protein2ipr data source display

Pengcheng Yang

Hi miners,

I want to load the protein2ipr data and display it the way HumanMine does.

I know that the domain data in HumanMine come from UniProt. I have the same kind of data for our species. How can I load the protein2ipr data and display it?

Following the InterMine documentation, I loaded the interpro and protein2ipr data sources. However, when I query the database with "select * from proteindomainregion;", it returns 0 rows. Do I need to write a custom data source for this purpose?

Best,

Pengcheng





_______________________________________________
dev mailing list
[hidden email]
https://lists.intermine.org/mailman/listinfo/dev

Re: protein2ipr data source display

Julie Sullivan-2
Did you have proteins loaded in your database already from UniProt?

The protein2ipr source queries for proteins in the database then loads
the associated domains. If you haven't first run UniProt, no extra data
will be stored.

-- is this protein present in your database? (from uniprot)
select * from protein where primaryaccession = 'Q9U943';

-- is this domain present in your database? (from protein2ipr)
select * from proteindomain where identifier = 'IPR001747';

I've attached a snippet of the data file. If you have that protein
loaded, the protein2ipr source should have loaded these domains.

1. Can you verify you ran UniProt, then protein2ipr?
2. Can you verify your local data file has these data?

(In reply to Pengcheng Yang's message of 04/09/2018 11:28 AM.)

Attachment: locust-domains (1K)

Re: protein2ipr data source display

Pengcheng Yang
Hi Julie,

I loaded the protein sequences in FASTA format, then interpro, then
protein2ipr. The protein2ipr data was prepared in this tab-separated format:
ProteinID<tab>IPRID<tab>Description<tab>DomainDBID<tab>Start<tab>End.
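
As a rough illustration of the six-column layout described above, the following sketch splits one tab-separated line into named fields. The class name `DomainMatch`, the function name, and the example row are all hypothetical; only the column order comes from the message.

```python
from typing import NamedTuple


class DomainMatch(NamedTuple):
    """One row of a protein2ipr-style file (field names are illustrative)."""
    protein_id: str
    ipr_id: str
    description: str
    domain_db_id: str
    start: int
    end: int


def parse_protein2ipr_line(line: str) -> DomainMatch:
    """Split one tab-separated line into the six fields named in the message."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 tab-separated fields, got {len(fields)}")
    protein_id, ipr_id, description, domain_db_id, start, end = fields
    return DomainMatch(protein_id, ipr_id, description, domain_db_id,
                       int(start), int(end))


# Made-up example row: the IPR ID, domain DB ID and coordinates are placeholders.
example = "LOCMI11198-p1\tIPR000000\tExample domain\tPF00000\t10\t120\n"
match = parse_protein2ipr_line(example)
print(match.protein_id, match.ipr_id, match.start, match.end)
```

A row with a missing column or a non-numeric coordinate raises `ValueError`, which can help spot formatting problems in a hand-built file before loading it.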

The reason the protein domains are not displayed is probably that I
haven't loaded UniProt.

However, our species has no genome data in UniProt, so how can I load
protein2ipr data into the Mine? Is there an existing data source that could be used?

Best,

Pengcheng



(In reply to Julie Sullivan's message of 2018-4-9 18:59.)


Re: protein2ipr data source display

Julie Sullivan-2
Sorry, that was poorly worded. You can load proteins from anywhere! Not
just UniProt.

So if your FASTA has the same accessions as interpro, you should be
fine. What do your proteins look like?

Here's the query the protein2ipr source runs:

https://github.com/intermine/intermine/blob/dev/bio/sources/protein2ipr/main/src/org/intermine/bio/dataconversion/Protein2iprConverter.java#L219

Step 1: The query is returning the "primary accession" for all proteins
and creating a big map called "proteinIds".

Step 2: Then the source loops through the interpro file and, if there is
a match on primary accession (from step 1), it stores the associated
protein domain and region.
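
The two steps above can be sketched roughly as follows. This is not the converter's actual code; the function names, the dictionary layout, and the sample rows are assumptions made only to show why rows with unmatched accessions are silently skipped.

```python
def collect_protein_ids(proteins):
    """Step 1: build a map from primary accession to the stored protein's ID."""
    return {
        p["primaryAccession"]: p["id"]
        for p in proteins
        if p.get("primaryAccession")  # proteins without an accession cannot match
    }


def match_domain_rows(protein_ids, rows):
    """Step 2: keep only the rows whose first column is a known accession."""
    stored, skipped = [], []
    for row in rows:
        accession = row[0]
        if accession in protein_ids:
            stored.append(row)
        else:
            skipped.append(accession)
    return stored, skipped


# Hypothetical contents of the mine and of the domain file.
proteins_in_mine = [{"id": 1, "primaryAccession": "P_LOADED"}]
domain_rows = [
    ("P_LOADED", "IPR000000", "Example domain", "PF00000", 10, 120),
    ("P_MISSING", "IPR000000", "Example domain", "PF00000", 5, 80),
]

protein_ids = collect_protein_ids(proteins_in_mine)
stored, skipped = match_domain_rows(protein_ids, domain_rows)
print(len(stored), skipped)
```

If the map from step 1 is empty, or its keys differ from the first column of the file, step 2 stores nothing and the load still finishes without an error.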

(In reply to Pengcheng Yang's message of 04/09/2018 03:36 PM.)

Re: protein2ipr data source display

Pengcheng Yang
Hi Julie,

1. The protein sequence file is in plain FASTA format, for example:

>LOCMI11198-p1  [mRNA]  locus=scaffold10286:2195655:2270421:- [translate_table: standard]
MSKRTVRKRLAALHSLQLPKPLPPRYSYEQIQEIAQFLLERTKIRPKIGI
ICGTGLGELVESVTERQVFPYEDIPGFPVSTVVGHAGKLVFGVLNGVNVM
CMQGRFHAFEGYPVWKCAMPVRVMKLCGVTHLIATNAAGGISEHLEVGDI
MIIKDHINLLGLFGNSPLIGPNDERFGPRFPATVKSYDKELRRKVASVAK
EMGISNIIKEGVYCAQGGPAFETVAEIRCLKTMGVDAIGMSTVHEVVTAR
HCGLTVVGLSLITNRCVLSYEEEEEEEVVTHESVIAVSASRARLLQQLVC
RLVPVVLQA

2. The project.xml entries for loading the proteins, interpro and protein2ipr:

    <source name="locust-protein-fasta" type="fasta" dump="true">
      <property name="fasta.className" value="org.intermine.model.bio.Protein"/>
      <property name="fasta.sequenceType" value="protein"/>
      <property name="fasta.dataSourceName" value="LocustGenomeProject"/>
      <property name="fasta.dataSetTitle" value="Locust protein sequences"/>
      <property name="fasta.taxonId" value="7004"/>
      <property name="fasta.includes" value="Locust.V2.4.1.pep"/>
      <property name="src.data.dir" location="/home/pengchy/Project/LocustMine/testdata/genome/protein"/>
    </source>

    <source name="interpro" type="interpro" dump="true">
      <property name="src.data.dir" location="/home/pengchy/Project/LocustMine/testdata/interpro/"/>
    </source>

    <source name="protein2ipr" type="protein2ipr" dump="true">
      <property name="src.data.dir" location="/home/pengchy/Project/LocustMine/testdata/protein2ipr/"/>
      <property name="src.data.dir.includes" value="Locust.V2.4.1.protein2ipr"/>
      <property name="protein2ipr.organisms" value="7004"/>
    </source>


3. The load was run with project_build and finished successfully, with
this log output:


starting command: ant -v -Dsource=locust-protein-fasta

Mon Apr  9 10:07:13 CST 2018

finished

action locust-protein-fasta took 14 seconds

Mon Apr  9 10:07:13 CST 2018

starting command: ant -v -Dsource=interpro

Mon Apr  9 10:13:12 CST 2018

finished


action interpro took 64 seconds

Mon Apr  9 10:13:12 CST 2018

starting command: ant -v -Dsource=protein2ipr

Mon Apr  9 10:13:39 CST 2018

finished


action protein2ipr took 22 seconds

Mon Apr  9 10:13:39 CST 2018

4. However, after a successful release, the protein domains are not
displayed and the "proteindomainregion" table in the mine database is empty.

The accessions in the protein FASTA file are the same as those in the
protein2ipr file. So I wonder whether the taxonomy ID could be the cause,
because "the parser will only load the proteins of the specified species
which are already loaded into the mine."

Is there any other information I could provide for debugging?

Thank you.

Best,

Pengcheng


(In reply to Julie Sullivan's message of 2018-4-9 23:06.)


Re: protein2ipr data source display

Julie Sullivan-2
Pengcheng

This returns nothing for me:

        grep LOCMI11198 protein2ipr.dat

But you are saying the protein accession in your FASTA file _does_ match
what's in protein2ipr.dat? Is that right?

Can you give me an example accession?

Because as far as I can tell, your accessions in the FASTA do not match
the accessions in the interpro file, e.g. "Q9U943". Which is why you are
seeing no data.

Please let me know if I am wrong!

Julie
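
The grep check above can be generalised to a whole-file comparison. The sketch below is one possible way to do it, with in-memory stand-ins for the two files. It assumes the identifier stored in the mine is the first whitespace-delimited token of each FASTA header, which may not match how the FASTA source is configured. The domain row is illustrative: only its two identifiers appear in this thread, and the remaining columns are placeholders.

```python
import io


def fasta_ids(handle):
    """Yield the first whitespace-delimited token of each FASTA header line."""
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                yield tokens[0]


def first_column(handle):
    """Yield the first tab-separated column of each non-empty line."""
    for line in handle:
        if line.strip():
            yield line.split("\t", 1)[0].strip()


# In-memory stand-ins for the FASTA file and the protein2ipr file.
fasta = io.StringIO(
    ">LOCMI11198-p1  [mRNA]  locus=scaffold10286:2195655:2270421:-\n"
    "MSKRTVRKRL\n"
)
domains = io.StringIO("Q9U943\tIPR001747\tplaceholder\tPF00000\t1\t2\n")

loaded = set(fasta_ids(fasta))
in_file = set(first_column(domains))

print("shared accessions:", sorted(loaded & in_file))
print("only in domain file:", sorted(in_file - loaded))
```

An empty set of shared accessions would explain an empty proteindomainregion table, since no row in the domain file can be attached to a loaded protein.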

(In reply to Pengcheng Yang's message of 04/09/2018 04:36 PM.)

Re: protein2ipr data source display

Pengcheng Yang
Hi Julie,

Sorry for my mistake. I meant that I produced the protein2ipr file
myself. The protein domains were annotated with InterProScan, and I
prepared the protein2ipr file for our species in the same format as the
protein2ipr.dat file downloaded from InterPro.

In short: we sequenced a genome and annotated the protein domains
with InterProScan. Now I want to load our protein domain data
into the Mine. Is it possible to load this self-produced protein2ipr data
using the protein2ipr data source?

Again, sorry for the confusion.

Best,

Pengcheng


(In reply to Julie Sullivan's message of 2018-4-10 15:44.)


Re: protein2ipr data source display

Julie Sullivan-2
No worries! I am glad you have the correct file.

Yes, you should be able to use the same data parser if the data file is
in the same format. I am assuming the files are the same as they are
produced by the same software.

And yes, you are right, it could be the taxon ID. That's the only other
table in the query:

<query name="" model="genomic" view="Protein.primaryAccession"
longDescription="" sortOrder="Protein.primaryAccession asc">
   <constraint path="Protein.organism.taxonId" op="=" value="7004"/>
</query>

What results does that give you? Are those values (the primary
accessions) also in your interpro data file?
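
To make the effect of the taxon constraint concrete, here is a toy stand-in that uses an in-memory SQLite database. The table and column names are assumptions modelled on the SQL earlier in this thread and may differ from a real mine's schema. The rows are invented, including one protein with no primary accession.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Toy schema: names are assumptions, not the real InterMine schema.
    CREATE TABLE organism (id INTEGER PRIMARY KEY, taxonid TEXT);
    CREATE TABLE protein (
        id INTEGER PRIMARY KEY,
        primaryaccession TEXT,
        organismid INTEGER REFERENCES organism(id)
    );
    INSERT INTO organism VALUES (1, '7004'), (2, '9606');
    INSERT INTO protein VALUES
        (10, 'LOCMI11198-p1', 1),
        (11, 'HYPOTHETICAL-1', 2),
        (12, NULL, 1);
    """
)


def accessions_for_taxon(connection, taxon_id):
    """Return the primary accessions of proteins linked to the given taxon ID."""
    rows = connection.execute(
        """
        SELECT p.primaryaccession
        FROM protein AS p
        JOIN organism AS o ON o.id = p.organismid
        WHERE o.taxonid = ? AND p.primaryaccession IS NOT NULL
        ORDER BY p.primaryaccession
        """,
        (taxon_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(accessions_for_taxon(conn, "7004"))
```

Proteins attached to another organism, or stored without a primary accession, drop out of the result, so it is worth checking both the organism link and which identifier field the protein loader populated.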

On 04/10/2018 09:18 AM, Pengcheng Yang wrote:

> Hi Julie,
>
> Sorry for my mistake. I mean that the protein2ipr file is produced by
> myself. The protein domain information was annotated using
> InterProScan.I have prepared the protein2ipr file of our species
> according to the format of the protein2ipr.dat, which downloaded from
> interpro.
>
> The fact is. We have sequenced a genome and annotated the protein domain
> using InterProScan software. Now, I want to load our protein domain data
> into the Mine. Is it possible to load the self produced protein2ipr data
> into the mine using the protein2ipr data source?
>
> Again, sorry for the confusion.
>
> Best,
>
> Pengcheng
>
>
> On 2018-4-10 15:44, Julie Sullivan wrote:
>> Pengcheng
>>
>> This returns nothing for me:
>>
>>     grep LOCMI11198 protein2ipr.dat
>>
>> But you are saying the protein accession in your FASTA file _does_
>> match what's in protein2ipr.dat? Is that right?
>>
>> Can you give me an example accession?
>>
>> Because as far as I can tell, your accessions in the FASTA do not
>> match the accessions in the interpro file, e.g. "Q9U943". Which is why
>> you are seeing no data.
>>
>> Please let me know if I am wrong!
>>
>> Julie
>>
>> On 04/09/2018 04:36 PM, Pengcheng Yang wrote:
>>> Hi Julie,
>>>
>>> 1. The protein sequence file is just in fasta format, like:
>>>
>>>  >LOCMI11198-p1  [mRNA] locus=scaffold10286:2195655:2270421:-
>>> [translate_table: standard]
>>> MSKRTVRKRLAALHSLQLPKPLPPRYSYEQIQEIAQFLLERTKIRPKIGI
>>> ICGTGLGELVESVTERQVFPYEDIPGFPVSTVVGHAGKLVFGVLNGVNVM
>>> CMQGRFHAFEGYPVWKCAMPVRVMKLCGVTHLIATNAAGGISEHLEVGDI
>>> MIIKDHINLLGLFGNSPLIGPNDERFGPRFPATVKSYDKELRRKVASVAK
>>> EMGISNIIKEGVYCAQGGPAFETVAEIRCLKTMGVDAIGMSTVHEVVTAR
>>> HCGLTVVGLSLITNRCVLSYEEEEEEEVVTHESVIAVSASRARLLQQLVC
>>> RLVPVVLQA
>>>
>>> 2. the project.xml file for load protein, interpro, protein2ipr
>>>
>>>   <source name="locust-protein-fasta" type="fasta" dump="true">
>>>        <property name="fasta.className"
>>> value="org.intermine.model.bio.Protein"/>
>>>        <property name="fasta.sequenceType" value="protein"/>
>>>        <property name="fasta.dataSourceName"
>>> value="LocustGenomeProject"/>
>>>        <property name="fasta.dataSetTitle" value="Locust protein
>>> sequences"/>
>>>        <property name="fasta.taxonId" value="7004"/>
>>>        <property name="fasta.includes" value="Locust.V2.4.1.pep"/>
>>>        <property name="src.data.dir"
>>> location="/home/pengchy/Project/LocustMine/testdata/genome/protein"/>
>>>      </source>
>>>
>>>      <source name="interpro" type="interpro" dump="true">
>>>        <property name="src.data.dir"
>>> location="/home/pengchy/Project/LocustMine/testdata/interpro/"/>
>>>      </source>
>>>
>>>      <source name="protein2ipr" type="protein2ipr" dump="true">
>>>        <property name="src.data.dir"
>>> location="/home/pengchy/Project/LocustMine/testdata/protein2ipr/"/>
>>>        <property name="src.data.dir.includes"
>>> value="Locust.V2.4.1.protein2ipr"/>
>>>        <property name="protein2ipr.organisms" value="7004"/>
>>>      </source>
>>>
>>>
>>> 3. the loading was performed using project_build and successfully
>>> finished with the log information:
>>>
>>>
>>> starting command: ant -v -Dsource=locust-protein-fasta
>>>
>>> Mon Apr  9 10:07:13 CST 2018
>>>
>>> finished
>>>
>>> action locust-protein-fasta took 14 seconds
>>>
>>> Mon Apr  9 10:07:13 CST 2018
>>>
>>> starting command: ant -v -Dsource=interpro
>>>
>>> Mon Apr  9 10:13:12 CST 2018
>>>
>>> finished
>>>
>>>
>>> action interpro took 64 seconds
>>>
>>> Mon Apr  9 10:13:12 CST 2018
>>>
>>> starting command: ant -v -Dsource=protein2ipr
>>>
>>> Mon Apr  9 10:13:39 CST 2018
>>>
>>> finished
>>>
>>>
>>> action protein2ipr took 22 seconds
>>>
>>> Mon Apr  9 10:13:39 CST 2018
>>>
>>> 4. However, after a successful release, the protein domains were not
>>> displayed and the "proteindomainregion" table in the mine database is
>>> empty.
>>>
>>> The accessions in the protein FASTA file are the same as those in the
>>> uniprot2ipr file. So I wonder whether this could be due to the
>>> taxonomy ID, because "the parser will only load the proteins of the
>>> specified species which are already loaded into the mine."
>>>
>>> Please tell me if there is any other information I could provide for
>>> debugging.
>>>
>>> Thank you.
>>>
>>> Best,
>>>
>>> Pengcheng
>>>
>>>
>>> On 2018-4-9 23:06, Julie Sullivan wrote:
>>>> Sorry, that was poorly worded. You can load proteins from anywhere!
>>>> Not just UniProt.
>>>>
>>>> So if your FASTA has the same accessions as interpro, you should be
>>>> fine. What do your proteins look like?
>>>>
>>>> Here's the query the interpro source runs:
>>>>
>>>> https://github.com/intermine/intermine/blob/dev/bio/sources/protein2ipr/main/src/org/intermine/bio/dataconversion/Protein2iprConverter.java#L219 
>>>>
>>>>
>>>> Step 1: The query is returning the "primary accession" for all
>>>> proteins and creating a big map called "proteinIds".
>>>>
>>>> Step 2: Then the source loops through the interpro file and, if
>>>> there is a match on primary accession (from step 1), it stores the
>>>> associated protein domain and region.
>>>>
>>>> On 04/09/2018 03:36 PM, Pengcheng Yang wrote:
>>>>> Hi Julie,
>>>>>
>>>>> I have loaded protein sequences in fasta format, then interpro,
>>>>> then protein2ipr. The protein2ipr data was prepared with the
>>>>> format:
>>>>> ProteinID<tb>IPRID<tb>Description<tb>DomainDBID<tb>Start<tb>End.
>>>>>
>>>>> That must be the reason: I haven't loaded UniProt, which is why
>>>>> the protein domains are not displayed.
>>>>>
>>>>> However, for a species without genome data in UniProt, how can I
>>>>> load protein2ipr data into the Mine? Is there an existing data
>>>>> source that could be used?
>>>>>
>>>>> Best,
>>>>>
>>>>> Pengcheng
>>>>>
>>>>>
>>>>>
>>>>> On 2018-4-9 18:59, Julie Sullivan wrote:
>>>>>> Did you have proteins loaded in your database already from UniProt?
>>>>>>
>>>>>> The protein2ipr source queries for proteins in the database then
>>>>>> loads the associated domains. If you haven't first run UniProt, no
>>>>>> extra data will be stored.
>>>>>>
>>>>>> # is this protein present in your database? (from uniprot)
>>>>>> select * from protein where primaryaccession = 'Q9U943';
>>>>>>
>>>>>> # is this domain present in your database? (from protein2ipr)
>>>>>> select * from proteindomain where identifier = 'IPR001747';
>>>>>>
>>>>>> I've attached a snippet of the data file. If you have that protein
>>>>>> loaded, the protein2ipr source should have loaded these domains.
>>>>>>
>>>>>> 1. Can you verify you ran UniProt, then protein2ipr?
>>>>>> 2. Can you verify your local data file has these data?
>>>>>>
>>>>>> On 04/09/2018 11:28 AM, Pengcheng Yang wrote:
>>>>>>> Hi miners,
>>>>>>>
>>>>>>> I want to load the protein2ipr data, and display the content as
>>>>>>> those in humanmine:
>>>>>>>
>>>>>>>
>>>>>>> I know that the domain data in humanmine came from UniProt.
>>>>>>> However, I have the same kind of data for our species. How
>>>>>>> do I load the protein2ipr data and display them?
>>>>>>>
>>>>>>> According to the intermine_documentation, I have loaded the
>>>>>>> interpro and protein2ipr data sources. However, when I query the
>>>>>>> database with "select * from proteindomainregion;", it returns 0
>>>>>>> rows. Do I need to write a custom data source for this
>>>>>>> purpose?
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Pengcheng
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: protein2ipr data source display

Pengcheng Yang
Yes, this query returned all the protein entries, which suggests that all
the protein sequences have the taxon ID 7004.

But the returned table has no Primary Accession values (NO VALUE). So this
is the reason the proteins have not been linked to the protein domain
data. How do I set the "Primary Accession" of these protein sequences? I
loaded them in FASTA format.
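
For anyone debugging the same symptom, the matching described in this thread (collect the proteins' primary accessions, then keep only the protein2ipr rows whose protein ID is in that set) can be imitated offline. This is only a minimal sketch, not the InterMine code: it assumes a tab-separated protein2ipr file with the protein ID in the first column, takes the first whitespace-delimited token of each FASTA header as the identifier, and uses hypothetical function names and made-up sample rows.

```python
import io


def fasta_ids(handle):
    """Return the first token of every FASTA header line."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def matching_rows(ipr_handle, known_ids):
    """Keep protein2ipr rows whose first column is a known protein ID."""
    rows = []
    for line in ipr_handle:
        fields = line.rstrip("\n").split("\t")
        if fields and fields[0] in known_ids:
            rows.append(fields)
    return rows


# Made-up sample data for illustration only (not real annotations).
fasta = io.StringIO(
    ">LOCMI11198-p1  [mRNA] locus=scaffold10286:2195655:2270421:-\n"
    "MSKRTVRKRL\n"
)
ipr = io.StringIO(
    "LOCMI11198-p1\tIPR001747\texample description\tPF00000\t10\t120\n"
    "Q9U943\tIPR001747\texample description\tPF00000\t5\t80\n"
)

ids = fasta_ids(fasta)
rows = matching_rows(ipr, ids)
print(len(rows), "row(s) matched")
```

In the mine itself the set of known IDs comes from Protein.primaryAccession rather than from the FASTA headers, so if that field is empty the set is effectively empty and no row can match, which would explain an empty "proteindomainregion" table.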


On 2018-4-10 16:27, Julie Sullivan wrote:

> No worries! I am glad you have the correct file.
>
> Yes, you should be able to use the same data parser if the data file
> is in the same format. I am assuming the files are the same as they
> are produced by the same software.
>
> And yes, you are right, it could be the taxon ID. That's the only
> other table in the query:
>
> <query name="" model="genomic" view="Protein.primaryAccession"
> longDescription="" sortOrder="Protein.primaryAccession asc">
>   <constraint path="Protein.organism.taxonId" op="=" value="7004"/>
> </query>
>
> What results does that give you? Are those values (the primary
> accessions) also in your interpro data file?
>
> On 04/10/2018 09:18 AM, Pengcheng Yang wrote:
>> Hi Julie,
>>
>> Sorry for my mistake. I mean that the protein2ipr file was produced by
>> myself. The protein domain information was annotated using
>> InterProScan. I have prepared the protein2ipr file for our species
>> according to the format of protein2ipr.dat, which is downloaded from
>> InterPro.
>>
>> The fact is, we have sequenced a genome and annotated the protein
>> domains using the InterProScan software. Now I want to load our protein
>> domain data into the Mine. Is it possible to load self-produced
>> protein2ipr data into the mine using the protein2ipr data source?
>>
>> Again, sorry for the confusion.
>>
>> Best,
>>
>> Pengcheng
>>
>>
>> On 2018-4-10 15:44, Julie Sullivan wrote:
>>> Pengcheng
>>>
>>> This returns nothing for me:
>>>
>>>     grep LOCMI11198 protein2ipr.dat
>>>
>>> But you are saying the protein accession in your FASTA file _does_
>>> match what's in protein2ipr.dat? Is that right?
>>>
>>> Can you give me an example accession?
>>>
>>> Because as far as I can tell, your accessions in the FASTA do not
>>> match the accessions in the interpro file, e.g. "Q9U943". Which is
>>> why you are seeing no data.
>>>
>>> Please let me know if I am wrong!
>>>
>>> Julie
>>>
