loading interpro.xml

loading interpro.xml

joe carlson
Hello,

Don’t you just hate it when code that you’ve been using for a year or so suddenly breaks?

I’ve just updated my version of interpro.xml (to 53.0) and am using your vanilla interpro data loader. I’m getting error messages because there are non-UTF8 characters:

[integrate] Caused by: java.sql.SQLException: Error writing to database, running statement COPY Attribute (name, intermine_value, itemId) FROM STDIN;
[integrate] , data size = 2682870
[integrate]     at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:56)
[integrate]     at org.intermine.sql.writebatch.Batch$BatchFlusher.run(Batch.java:461)
[integrate]     at java.lang.Thread.run(Thread.java:745)
[integrate] Caused by: org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0xfd
[integrate]   Where: COPY attribute, line 42945
[integrate]     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
[integrate]     at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:966)
[integrate]     at org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:828)
[integrate]     at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:59)
[integrate]     at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:181)
[integrate]     at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:161)
[integrate]     at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:51)
[integrate]     ... 2 more
     [null] Exiting /global/u1/j/jcarlson/src/intermine/bio/sources/interpro/build.xml.

The header of interpro.xml says it has encoding ISO-8859-1. I’ve looked at it and it does have some naughty characters in it.

My server encoding is UTF8. I’ve tried playing around with iconv to make the xml UTF8, but nothing has helped (so far). I’d rather not resort to hand edits.
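
For readers following along, here is a minimal sketch of what the error message is complaining about and what a transcoding step is meant to achieve. It is an illustration only, not part of the InterMine loader: the sample bytes and the helper names (transcode_latin1_to_utf8, is_valid_utf8) are hypothetical. The byte 0xfd is the one reported in the trace above. In ISO-8859-1 it is a single character, but on its own it is never a valid UTF-8 sequence.

```python
def transcode_latin1_to_utf8(raw: bytes) -> bytes:
    """Decode bytes as ISO-8859-1 and re-encode the text as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 without error."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample data; 0xfd is the byte named in the PSQLException.
sample = b"example \xfd value"
converted = transcode_latin1_to_utf8(sample)

print(is_valid_utf8(sample))     # False: a lone 0xfd is not valid UTF-8
print(is_valid_utf8(converted))  # True: 0xfd has become the two bytes 0xc3 0xbd
```

If a whole file is converted this way, the encoding named in its XML declaration would presumably need to change to match, otherwise a parser that trusts the declaration would misread the new bytes.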

Which version of interpro.xml do you load in flymine? Did you have this problem? Or, should I open a ticket?

Thanks,

joe

_______________________________________________
dev mailing list
[hidden email]
http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
Re: loading interpro.xml

HongKee Moon
Hi Joe,

We can look at this problem in two ways.
One is the locale setting on your machine and the other is the postgres client setting.

First, most client applications depend on the user's environment settings.
So I am wondering what the locale setting is on your postgres/intermine machine.
I recently noticed that when LC_ALL=en_US.UTF-8 is not set, Java no longer defaults to UTF-8 encoding.

The other option is to change the postgres client setting.
In this case, please follow this link: http://www.postgresql.org/message-id/4B79EE13.1030906@...

Hopefully, it will work for you.

Cheers,
HongKee

--
HongKee Moon
Software Engineer
Scientific Computing Facility

Max Planck Institute of Molecular Cell Biology and Genetics
Pfotenhauerstr. 108
01307 Dresden
Germany

fon: +49 351 210 2740
fax: +49 351 210 1689



loading interpro.xml (Solved)

joe carlson

Never mind.

Sorry. This was an error in my codebase.

The other week we were trying out an alternate db server that was alleged to be postgres-compliant. It was not, and I tried a couple of tweaks to make it work before I gave up. But I forgot to roll back my changes.

Joe

-------- Forwarded Message --------
Subject: loading interpro.xml
Date: Wed, 9 Dec 2015 21:16:53 -0800
From: Joe Carlson [hidden email]
To: [hidden email]

