Perl vs PHP for data loaders


Perl vs PHP for data loaders

Cannon, Ethalinda K [COM S]

Hi Tripal developers,


I'm curious how many groups develop scripts in Perl or other languages to load data rather than using or writing loaders within Tripal modules (custom, extensions, or core) or using the Bulk Loader. This is to get a sense of how comfortable curators and site administrators are with using loading tools that are not built into modules.


I am developing a set of loaders that will be interactive; for example, they will permit the user to confirm/ignore/deny questionable data detected by the loader. This appears to be a daunting task within Drupal.


Ethy



_______________________________________________
Gmod-tripal mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-tripal

Re: Perl vs PHP for data loaders

Stephen Ficklin-2
Hi Ethy,

I am aware of folks who use Perl to create loaders, mostly, I think, because it's familiar to them and Drupal is a bit more challenging. In our original design of the Bulk Loader we did have the idea of something that would allow folks to confirm/ignore/deny questionable data, but that was never formalized. I suspect, though, that the reason you are writing custom loaders is that the Bulk Loader isn't quite appropriate for your data?

In the new section titled 'Sharing Your Custom Modules' in the Tripal Users Guide (http://tripal.info/node/253), which I added just this week, I recommended that folks avoid writing Perl, Python, or other components that interact with Drupal. The side effect of such components is more complex management for the site (Perl/Python dependencies, more systems-administration responsibilities). At the same time, we don't want to create barriers to adopting Tripal, so maybe we should talk more about the use of Perl/Python/etc. What is it that makes creating this type of interactive loader seem daunting in Drupal?

Thanks for raising the topic,
Stephen

In reply to Cannon, Ethalinda K [COM S], 12/3/2015 9:08 AM.

Re: Perl vs PHP for data loaders

Cannon, Ethalinda K [COM S]

Part of the decision behind using Perl for the QTL/Map/Marker loaders is expediency. It's also partly to allow non-Tripal websites to use them, and partly because of the amount of interaction with the user. We would like to implement them in Tripal eventually, if feasible.


The loaders have a two-step process: verify, then load. If there are errors, the user can skip the faulty record or stop the process. If there are suspicious records, the user can decide whether each record should be loaded anyway or skipped, or whether to stop the process. The scripts also support modifying existing records and ask permission before doing so. I don't see an easy way to implement this sort of interactive process (it's not just one simple yes or no response) in Drupal.
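As a rough illustration of the verify-then-load flow described above (not the actual Perl loaders), here is a minimal Python sketch. Every name in it (`Status`, `StopLoading`, `verify`, `load`, `ask_on_console`, `demo_check`) is hypothetical, and the rule that an equal start and end counts as "suspicious" is only an assumption made for the demo. Asking permission before modifying existing records is not covered.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    SUSPICIOUS = "suspicious"
    ERROR = "error"


class StopLoading(Exception):
    """Raised when the user chooses to stop the whole run."""


def verify(records, check, ask):
    """Step 1: check every record and return those approved for loading.

    check(record) returns (Status, reason); ask(record, reason, options)
    returns one of the offered options.
    """
    approved = []
    for record in records:
        status, reason = check(record)
        if status is Status.OK:
            approved.append(record)
            continue
        # A faulty record can only be skipped; a suspicious one may still be loaded.
        options = ("skip", "stop") if status is Status.ERROR else ("load", "skip", "stop")
        answer = ask(record, reason, options)
        if answer not in options:
            raise ValueError(f"unexpected answer {answer!r}; expected one of {options}")
        if answer == "stop":
            raise StopLoading(reason)
        if answer == "load":
            approved.append(record)
    return approved


def load(approved, store):
    """Step 2: hand each approved record to store (e.g. a database insert)."""
    for record in approved:
        store(record)
    return len(approved)


def ask_on_console(record, reason, options):
    """Prompt on the terminal until the user types one of the options."""
    while True:
        answer = input(f"{reason}: {record} [{'/'.join(options)}] ").strip().lower()
        if answer in options:
            return answer


def demo_check(record):
    """Hypothetical checks: empty name is an error, start == end is suspicious."""
    if not record.get("name"):
        return Status.ERROR, "required field 'name' is not set"
    if record.get("start") == record.get("end"):
        return Status.SUSPICIOUS, "start and end coordinates are equal"
    return Status.OK, ""
```

A command-line run would call `load(verify(records, demo_check, ask_on_console), store)`. Because verification finishes before anything is stored, choosing "stop" in this sketch leaves the database untouched.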


I'll also note that error checking ranges from simply ensuring that required fields are set, to verifying the existence of cvterms and ensuring that start coords are smaller than end coords, to fairly complex conditional checks (e.g. at least one of fields A, B, or C must be set; if field D is set then field E must also be set, otherwise neither should be set; the publication must be in the database or described in the spreadsheet). I think it would be difficult to handle all possible data checks in a general way.
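To make the range of checks listed above concrete, here is a small Python sketch that returns a list of problems for one record. It is an illustration only: the field names (`name`, `type`, `start`, `end`, `publication`, `A` through `E`), the `REQUIRED` tuple, and the in-memory sets standing in for cvterm and publication lookups are all assumptions, not the real loader's schema.

```python
REQUIRED = ("name", "type", "start", "end", "publication")  # hypothetical field names


def is_set(record, field):
    """A field counts as set if it is present and not blank."""
    value = record.get(field)
    return value is not None and str(value).strip() != ""


def check_record(record, cvterms, db_pubs, sheet_pubs):
    """Return a list of problem descriptions; an empty list means the record passed."""
    problems = [f"required field '{f}' is not set" for f in REQUIRED if not is_set(record, f)]

    # The record's type must be a known cvterm.
    if is_set(record, "type") and record["type"] not in cvterms:
        problems.append(f"unknown cvterm '{record['type']}'")

    # Start coordinate must be smaller than end coordinate.
    if is_set(record, "start") and is_set(record, "end"):
        try:
            start, end = int(record["start"]), int(record["end"])
        except (TypeError, ValueError):
            problems.append("start and end must be integers")
        else:
            if start >= end:
                problems.append("start coordinate must be smaller than end coordinate")

    # At least one of A, B, or C must be set.
    if not any(is_set(record, f) for f in ("A", "B", "C")):
        problems.append("at least one of fields A, B, or C must be set")

    # D and E must be set together or not at all.
    if is_set(record, "D") != is_set(record, "E"):
        problems.append("fields D and E must both be set or both be empty")

    # The publication must be in the database or described in the spreadsheet.
    if is_set(record, "publication"):
        pub = record["publication"]
        if pub not in db_pubs and pub not in sheet_pubs:
            problems.append(f"publication '{pub}' is not in the database or the spreadsheet")

    return problems
```

Each rule here is hand-written for one record layout, which is the point made above: the conditional checks are specific enough to the data that a fully general checker would be hard to write.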

The reason for building in so much checking is that the data is very complex, and errors are easier to catch and fix at the loading phase than after the data has been loaded.

Ethy

In reply to Stephen Ficklin, Thursday, December 03, 2015 12:08 PM.

Re: Perl vs PHP for data loaders

Stephen Ficklin-2
Hi Ethy,

I completely understand expediency. Drupal/PHP does have a bit of a learning curve. If you intend to share the loaders later, it would make management and long-term maintenance much easier for others if the loaders were part of your module in PHP (no Perl dependencies, no extra setup; it just works when the module is enabled). Plus, if you decide later that you want to provide that convenience to those who use your module, you won't have to rewrite the loaders in PHP. So, despite the slow start, the long-term benefits may justify the extra effort of writing them in PHP with a web-based interface?

I wonder if it would help if we included some design and coding help in our monthly meetings for specific projects like this? We're all at different levels in terms of understanding how to use the Drupal API and hooks, so perhaps some input on design and coding may be of use to folks?

Stephen

In reply to Cannon, Ethalinda K [COM S], 12/3/2015 10:36 AM.

Re: Perl vs PHP for data loaders

Cannon, Ethalinda K [COM S]

Hi Stephen,


The plan is to eventually implement the loaders in the QTL/Map/Marker Tripal module, but that may be a ways out. The more I think about it, though, the more I want Perl loaders too, because I'd like to use the same data mapping for at least one and maybe two other websites that use Chado but don't use Tripal. In the long run, we obviously don't want to maintain two loaders, and I expect that the Tripal loader will become the loader of choice.

Some development tutorials during the Tripal developers call would be helpful, in addition to the tutorials and developer documentation that already exist at tripal.info (which I have consulted frequently). Perhaps a single specific issue at a time, like writing a loader, could be addressed when there is an open agenda. 

Ethy

In reply to Stephen Ficklin, Monday, December 07, 2015 2:13 PM.