Error running load_ncbi_taxonomy.pl with input file

Nicolas Joannin
Hello,

I have encountered an error running load_ncbi_taxonomy.pl with an input file of txids.
You'll find the last few lines of the output below.

Do you have any idea what I might have done wrong?

Best regards,
Nicolas

Command line output:

### MUCH MORE BEFORE THIS ###
Found child_id 14104 (organism_id = 14132) 
walking the tree for id 14104, index count is 241
Setting left index= 242 for parent 14104

Found child_id 15654 (organism_id = 15682) 
walking the tree for id 15654, index count is 242
Setting left index= 243 for parent 15654

Setting right index= 244 for phylonode id 15654

Setting right index= 245 for phylonode id 14104

Setting right index= 246 for phylonode id 25556

Setting right index= 247 for phylonode id 25068

Updating the phylonode and phylonode_organism tables

An error occured! Rolling back! 
 DBIx::Class::Storage::DBI::__ANON__(): DBI Exception: DBD::Pg::db do failed: ERROR:  null value in column "left_idx" violates not-null constraint at load_ncbi_taxonomy.pl line 528
 
 Resetting database sequences...


Nicolas Joannin, Ph.D.
Bioinformatics Center
Kyoto University, Uji campus, Japan


_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema

Re: Error running load_ncbi_taxonomy.pl with input file

Naama Menda-3
Hi Nicolas 

How did you generate your input file, and which options did you use for running the script?

Naama



On Monday, June 3, 2013, Nicolas Joannin wrote:

--
Naama Menda
Boyce Thompson Institute for Plant Research
Tower Rd
Ithaca NY 14853
USA

(607) 254 3569
Sol Genomics Network
http://solgenomics.net/
[hidden email]


Re: Error running load_ncbi_taxonomy.pl with input file

Nicolas Joannin
Hi Naama,

I generated the file through the NCBI Taxonomy webpage: I searched for all the taxons I wanted and downloaded their ids to a file. The input file has one txid per line and ends with an empty line.
The script was run like this: 
perl load_ncbi_taxonomy.pl -H host -D dbname -u username -d Pg -p password -i path/to/infile -v -t

Let me know if you need more info!

Best regards,
Nicolas



On Mon, Jun 3, 2013 at 7:27 PM, Naama Menda <[hidden email]> wrote:

Re: Error running load_ncbi_taxonomy.pl with input file

Naama Menda-3
It sounds from your description that you manually generated the input file with a number of taxons. I don't think the loader can work if you selectively pick a number of taxons. Your input file is for loading part of the NCBI taxonomy tree, so it must include a parent taxon and all the nodes when you climb the tree all the way to the leaves (it is OK to exclude a leaf, but not any of the nodes on the way). This is because each node in the tree is stored with right and left indices, which makes reconstructing trees very quick, since you don't have to climb up and down the tree each time you query. This means that if you exclude a parent node, you will be missing one of the indices.

If you want to load a number of branches, you will need to load them separately, unless you go up to their common root. 
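
To illustrate the left/right indexing described above, here is a minimal Python sketch. It is not the code from load_ncbi_taxonomy.pl; the toy tree, the node ids and the function names are all made up for illustration. It numbers the nodes with a depth-first walk and shows that a node whose parent is missing from the input is never reached, so it ends up without indices (which is presumably what surfaces as the NULL left_idx in the database).

```python
def build_children(parent_of, present):
    """Map each parent to its children, keeping only edges whose two ends are present."""
    children = {}
    for node, parent in parent_of.items():
        if node in present and parent in present:
            children.setdefault(parent, []).append(node)
    return children


def assign_nested_set_indices(children, root):
    """Return {node: (left_idx, right_idx)} for every node reachable from root."""
    indices = {}
    counter = 0

    def walk(node):
        nonlocal counter
        counter += 1
        left = counter            # left index is set on the way down
        for child in children.get(node, []):
            walk(child)
        counter += 1
        indices[node] = (left, counter)  # right index is set on the way back up

    walk(root)
    return indices


def is_descendant(indices, node, ancestor):
    """A node lies under an ancestor if its interval is nested inside the ancestor's."""
    n_left, n_right = indices[node]
    a_left, a_right = indices[ancestor]
    return a_left < n_left and n_right < a_right


# Hypothetical toy tree: 1 is the root, 2 and 3 are its children, 4 is a child of 2.
parent_of = {2: 1, 3: 1, 4: 2}

# Full lineage present: every node gets a (left, right) pair.
full = assign_nested_set_indices(build_children(parent_of, {1, 2, 3, 4}), 1)

# Internal node 2 left out of the input: node 4 is never visited.
picked = {1, 3, 4}
partial = assign_nested_set_indices(build_children(parent_of, picked), 1)
unindexed = picked - partial.keys()

print(full)
print(unindexed)
```

With the full lineage, a subtree query is a simple interval comparison (`is_descendant`), with no need to walk the tree. With node 2 removed, node 4 stays in `unindexed`.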


You are also using the -t option, for a test run, but looking at the error message it does not look like this is the problem.

Can you try to load a single tree with the full lineage and let me know if that works? 

thanks
-Naama




On Mon, Jun 3, 2013 at 7:41 AM, Nicolas Joannin <[hidden email]> wrote:

Re: Error running load_ncbi_taxonomy.pl with input file

Nicolas Joannin
Hi Naama,

Thanks again for your help!
Indeed, that was the problem. I had several unrelated branches of the tree that I wanted to load, so there were several roots.
Will load them all individually...
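
A rough Python sketch of how such a list could be split into one group per root is below. It assumes a child-to-parent map is available (for example built from the NCBI taxonomy dump); the map, the ids and the function name are hypothetical, and this is not part of load_ncbi_taxonomy.pl.

```python
def split_by_root(selected, parent_of):
    """Group the selected ids by their top-most ancestor that is also selected.

    Only parent links whose two ends are both selected are followed, so a gap
    in a lineage starts a new group.
    """
    groups = {}
    for taxid in selected:
        top = taxid
        # Climb while the parent is also selected; the second check guards
        # against a root node that is listed as its own parent.
        while parent_of.get(top) in selected and parent_of[top] != top:
            top = parent_of[top]
        groups.setdefault(top, set()).add(taxid)
    return groups


# Hypothetical child -> parent map with two unrelated branches under 1 and 5.
parent_of = {2: 1, 3: 1, 4: 2, 6: 5, 7: 6}
selected = {2, 4, 6, 7}

groups = split_by_root(selected, parent_of)
for root, members in sorted(groups.items()):
    print(root, sorted(members))
```

Each group could then be written to its own input file and loaded in a separate run.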

Best regards,
Nicolas





On Tue, Jun 4, 2013 at 2:35 AM, Naama Menda <[hidden email]> wrote: