Re: Blast and GO

Stephen Ficklin-2
Hi Zhensheng,

You are getting the "Ambiguous" error messages with the blast module because it cannot uniquely resolve the features in your database.  For instance, the feature 'Znev_10135' is present more than once, and the loader does not know which feature to assign the blast results to.  To identify your features properly, you must set the following fields when creating your blast content page so that the loader can uniquely identify them:

1.  Query Type:  Did you set the query type?  If you have multiple features with the same name or uniquename but of a different type you can resolve the problem by specifying the type of features that the blast results are for.  For example, if you have 'gene' and 'mRNA' type features and you blasted mRNA sequence then set this to be 'mRNA'.

2.  Use Unique name:  This is a checkbox where you can indicate to the loader whether it should use the feature 'name' or the feature 'unique name' for finding the feature.  For example, if you have multiple features (perhaps of the same type) that have the exact same name but different unique names, then you want to check this box so that the loader will try to match the unique name of the feature with the name in the input file.

3.  Query Name RE:  By default, the Tripal blast loader looks in the blast results and uses the first word before a space or bar '|' as the name of the feature.  If that word is not the feature's name, you can provide a regular expression to pull out the proper name or unique name.  For example, suppose your features in Chado can be uniquely identified by unique name, but the "name" of the feature comes first on the definition line of your FASTA file.  By default, the loader will try to match that name with a feature in the database.  If you need it to match the unique name instead, and the unique name is somewhere else in the definition line (say, the second word), then you'll need to enter a regular expression that pulls out the correct unique name.
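
To make the Query Name RE option more concrete, here is a rough Python sketch of the two behaviours described above. It is not Tripal's code: the function names, the example definition line, and the idea that the first capture group of the pattern is used are all assumptions made for illustration.

```python
import re


def default_query_name(defline):
    """Return the first word before a space or '|' (the default described above)."""
    return re.split(r"[ |]", defline.strip(), maxsplit=1)[0]


def query_name_from_re(defline, pattern):
    """Return the first capture group of a user-supplied pattern, or None if no match.

    Assumption: the loader takes capture group 1 of the regular expression.
    """
    match = re.search(pattern, defline)
    return match.group(1) if match else None


# Hypothetical definition line: the name comes first, the unique name second.
defline = "Znev_10135 Znev_10135-RA some description"

print(default_query_name(defline))                   # first word
print(query_name_from_re(defline, r"^\S+\s+(\S+)"))  # second word
```

Trying the pattern on a few real definition lines this way before entering it in the form should show whether it pulls out the value you expect.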

You are getting the error when importing the GAF file because the file has an improper number of columns.  According to the GAF specification (http://www.geneontology.org/GO.format.gaf-2_0.shtml), you have to have at least 15 columns in your GAF input file.   Take a look at the GAF file you are importing and see. 
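
If it helps, a quick way to check the column count is a small script like the one below. This is only a sketch and not part of Tripal: the function name and sample lines are made up, and it assumes the file is tab-delimited with comment lines starting with '!', as in the GAF specification.

```python
def check_gaf_columns(lines, min_columns=15):
    """Return (line_number, column_count) for each data line with too few columns."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and GAF comment/header lines, which start with '!'.
        if not line or line.startswith("!"):
            continue
        count = len(line.split("\t"))
        if count < min_columns:
            problems.append((number, count))
    return problems


# Made-up sample: a header line, a short line, and a line with 15 columns.
sample = [
    "!gaf-version: 2.0",
    "Znev_00012\tGO:0004497;",
    "\t".join(["col"] * 15),
]
print(check_gaf_columns(sample))
```

To run it on a real file, pass an open file handle instead of the sample list, for example `check_gaf_columns(open("/home/dell/termite.GO"))`.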

I hope this makes sense; if not, feel free to ask for more info.  Also, if this doesn't help, could you respond and include the first few lines of your FASTA file so I can see the format, and a line or two of your GAF file?

Thanks,
Stephen

On 9/16/2011 6:39 AM, 陈镇生 wrote:
Hi Stephen,

I have some problems with blast and GO.

Blast:
dell@ubuntu:/var/www/drupal$ php ./sites/all/modules/tripal_core/tripal_launch_jobs.php btang
Tripal Job Launcher
-------------------
Calling: tripal_analysis_set_feature_permission(15, 37049, 82)
Updating feature permissions:
Calling: tripal_analysis_blast_parseXMLFile(15, 73, /home/gmod/Documents/Software/tripal/nr_10.blastp, all, , , , 1, 0, 83)
Parsing File:/home/gmod/Documents/Software/tripal/nr_10.blastp ...
Parsing all hits...
322 iterations to be processed.
Done.
Successful and failed entries have been saved in the log file:
 /tmp/tripal_analysis_blast_import5XO93z
Ambiguous: 'Znev_10135' matches more than one feature and is being skipped.
Ambiguous: 'Znev_03487' matches more than one feature and is being skipped.
Ambiguous: 'Znev_15786' matches more than one feature and is being skipped.
Ambiguous: 'Znev_05626' matches more than one feature and is being skipped.
Ambiguous: 'Znev_08294' matches more than one feature and is being skipped.
Ambiguous: 'Znev_04300' matches more than one feature and is being skipped.
Ambiguous: 'Znev_08565' matches more than one feature and is being skipped.
Ambiguous: 'Znev_09269' matches more than one feature and is being skipped.
Ambiguous: 'Znev_15605' matches more than one feature and is being skipped.
Ambiguous: 'Znev_09504' matches more than one feature and is being skipped.


GO:

dell@ubuntu:/var/www/drupal$ php ./sites/all/modules/tripal_core/tripal_launch_jobs.php btang
Tripal Job Launcher
-------------------
Calling: tripal_analysis_go_load_gaf(/home/dell/termite.GO, 13, 10, 1, , 0, , contig, 1, 86)
Opening GAF file /home/dell/termite.GO
ERROR: improper number of columns on line 1
Array
(
    [0] => Znev_00012
    [1] => GO:0004497;

)

How do I deal with them?

Thanks
zhensheng


------------------------------------------------------------------------------
BlackBerry® DevCon Americas, Oct. 18-20, San Francisco, CA
http://p.sf.net/sfu/rim-devcon-copy2
_______________________________________________
Gmod-tripal mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-tripal

Re: Reply: Blast and GO

Stephen Ficklin-2
Hi Zhensheng,

Great. I'm glad you got your blast data loaded! 

The example GO file you give below doesn't appear to be in the GAF format which is what you need to load into Tripal.  For your GO data, did you generate your GO file with Blast2GO?  Blast2GO can generate a GAF file.  If not, you'll need to format your file using the format described here: 
http://www.geneontology.org/GO.format.gaf-2_0.shtml

If you did not use Blast2GO and do not want to try to format your file in the GAF format, you can annotate your GFF file with your GO terms by adding an "Ontology_term" attribute to each feature.  For example:

Ontology_term=GO:0004788,GO:0016787

Then re-upload your GFF and it will update the features by adding the GO terms.
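
As a rough illustration of the GFF approach, the Python sketch below pulls the GO IDs out of one row of a tab-delimited GO file like the one quoted below and appends them to the attributes column of a GFF3 line. The function names and the example feature line are hypothetical, and it assumes the feature has no existing Ontology_term attribute.

```python
import re


def go_ids_from_row(row):
    """Return every GO identifier (GO: followed by 7 digits) found in one row."""
    return re.findall(r"GO:\d{7}", row)


def add_ontology_terms(gff_line, go_ids):
    """Append an Ontology_term attribute to column 9 of one GFF3 feature line."""
    columns = gff_line.rstrip("\n").split("\t")
    if len(columns) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns")
    attributes = columns[8].rstrip(";")
    term_attr = "Ontology_term=" + ",".join(go_ids)
    # A lone '.' (or nothing) means the feature has no attributes yet.
    if attributes in ("", "."):
        columns[8] = term_attr
    else:
        columns[8] = attributes + ";" + term_attr
    return "\t".join(columns)


# Hypothetical GFF3 feature line; the GO row follows the quoted file's layout.
gff_line = "contig1\t.\tmRNA\t100\t900\t.\t+\t.\tID=Znev_00015;Name=Znev_00015"
go_row = "Znev_00015\t1\tGO:0005515; protein binding; Molecular Function"

print(add_ontology_terms(gff_line, go_ids_from_row(go_row)))
```

You would still need to match each GO row to the right GFF line by feature name before writing out the updated file.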

Stephen


On 9/18/2011 9:43 PM, 陈镇生 wrote:
Hi Stephen,

The blast worked, but the GO did not. My GO file only has a few columns; how do I get 15 columns? Can you give me an example or detailed steps? Thank you very much.

GO file:

Znev_00012      5       GO:0004497; monooxygenase activity; Molecular Function  GO:0005506; iron ion binding; Molecular Function      
Znev_00014      3       GO:0000226; microtubule cytoskeleton organization; Biological Process   GO:0000922; spindle pole; Cellular Component   
Znev_00015      1       GO:0005515; protein binding; Molecular Function
Znev_00017      2       GO:0003824; catalytic activity; Molecular Function      GO:0008152; metabolic process; Biological Process
Znev_00018      4       GO:0005215; transporter activity; Molecular Function    GO:0006810; transport; Biological Process    
Znev_00019      2       GO:0005515; protein binding; Molecular Function GO:0008270; zinc ion binding; Molecular Function
Znev_00020      5       GO:0005230; extracellular ligand-gated ion channel activity; Molecular Function GO:0006810; transport; Biological Process      
Znev_00021      3       GO:0005515; protein binding; Molecular Function GO:0007186; G-protein coupled receptor protein signaling pathway; Biological Process   
Znev_00024      3       GO:0003676; nucleic acid binding; Molecular Function    GO:0005622; intracellular; Cellular Component   GO:0008270; zinc ion binding; Molecular Function
Znev_00025      1       GO:0004871; signal transducer activity; Molecular Function
Znev_00030      6       GO:0005216; ion channel activity; Molecular Function    GO:0005247; voltage-gated chloride channel activity; Molecular Function GO:0005515; protein bindi
Znev_00031      2       GO:0003824; catalytic activity; Molecular Function      GO:0008152; metabolic process; Biological Process
Znev_00032      1       GO:0005515; protein binding; Molecular Function

Thanks,
Zhensheng


Re: Reply: Reply: Blast and GO

Stephen Ficklin-2
Hi Zhensheng,

When you run Blast2GO, you can export your GO annotations in GAF format.   You can do this in Blast2GO under File -> Export -> Export Annotations -> Export Annotations in Go Annotation File Format (GAF v.2).  The resulting GAF file should be in the correct format.

Stephen

On 9/19/2011 7:54 AM, 陈镇生 wrote:
Hi Stephen,

The Blast2GO file:

Znev_08376  [mRNA] [translate_table: standard]  GO:0043161      proteasome ( macropain) 26s non- 5
Znev_08376  [mRNA] [translate_table: standard]  GO:0051325      proteasome ( macropain) 26s non- 5
Znev_08376  [mRNA] [translate_table: standard]  GO:0007050      proteasome ( macropain) 26s non- 5
Znev_08376  [mRNA] [translate_table: standard]  GO:0016070      proteasome ( macropain) 26s non- 5
Znev_08376  [mRNA] [translate_table: standard]  GO:0051439      proteasome ( macropain) 26s non- 5
Znev_08376  [mRNA] [translate_table: standard]  GO:0006915      proteasome ( macropain) 26s non- 5
Znev_08376  [mRNA] [translate_table: standard]  GO:0005488      proteasome ( macropain) 26s non- 5
Znev_08376  [mRNA] [translate_table: standard]  GO:0005838      proteasome ( macropain) 26s non- 5
Znev_08403  [mRNA] [translate_table: standard]  GO:0043231      creb-binding protein
Znev_08403  [mRNA] [translate_table: standard]  GO:0003712      creb-binding protein
Znev_08403  [mRNA] [translate_table: standard]  GO:0004468      creb-binding protein
Znev_08403  [mRNA] [translate_table: standard]  GO:0046914      creb-binding protein
Znev_08403  [mRNA] [translate_table: standard]  GO:0010468      creb-binding protein
Znev_08403  [mRNA] [translate_table: standard]  GO:0006351      creb-binding protein
Znev_08403  [mRNA] [translate_table: standard]  GO:0010557      creb-binding protein
Znev_09529  [mRNA] [translate_table: standard]  GO:0016020      zinc transporter zip1-like
Znev_09529  [mRNA] [translate_table: standard]  GO:0030001      zinc transporter zip1-like

How do I change it to a GAF file?

Thanks,
Zhensheng
