| 1 | This is what I understood from our discussions about data formats. I'll only talk about Bio::EMBL here, but the same applies to Bio::GenBank, etc.: |
| 2 | |
| 3 | = Creating Bio::Sequence objects from an EMBL file = |
| 4 | |
| 5 | * The Bio::EMBL object should basically just create a rich Bio::Sequence object and _not_ store any information in a Bio::EMBL object. |
| 6 | * To let a researcher call methods in an EMBL-specific way (e.g. saying ''my_seq.cc'' instead of ''my_seq.comments''), we will try the following: when a user types ''my_seq.embl.cc'', a Bio::EMBL object is created that holds a reference to the original Bio::Sequence object, and its ''cc'' method is redirected to the Bio::Sequence's ''comments'' method. |
| 7 | |
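The redirection described above could be sketched roughly as follows. This is only an illustration under assumptions: ''SketchSequence'' and ''SketchEMBLView'' are hypothetical stand-ins for Bio::Sequence and Bio::EMBL, and the ''embl'' accessor is the method proposed in this note, not an existing API. The view object stores no data of its own; it forwards ''cc'' to ''comments'' using Ruby's standard Forwardable module.

```ruby
require 'forwardable'

# Hypothetical stand-in for Bio::Sequence, reduced to the one
# attribute this sketch needs.
class SketchSequence
  attr_accessor :comments

  def initialize(comments = [])
    @comments = comments
  end

  # Proposed accessor: returns an EMBL-flavoured view of this same object.
  def embl
    SketchEMBLView.new(self)
  end
end

# Hypothetical stand-in for the Bio::EMBL wrapper. It keeps only a
# reference to the sequence and forwards EMBL-style names to it.
class SketchEMBLView
  extend Forwardable

  # Calling cc on the view calls comments on the wrapped sequence.
  def_delegator :@sequence, :comments, :cc

  def initialize(sequence)
    @sequence = sequence
  end
end

my_seq = SketchSequence.new(['first comment'])
puts my_seq.embl.cc.inspect
```

Because the view only delegates, anything changed through ''my_seq.comments'' is visible through ''my_seq.embl.cc'' as well, which is the point of not storing information in the EMBL object.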
| 8 | = Creating an EMBL file (well, the string) from a Bio::Sequence = |
| 9 | |
| 10 | * To write an EMBL-formatted sequence, the Bio::Sequence#output method is rewritten to use an ERB template stored in the /lib/bio/sequence/formats/ directory, with one ''<format>.erb'' file per output format. The existing Bio::Format object is bypassed completely, and we still have to check whether it can be removed. The new Bio::Sequence#output method now looks like this: |
| 11 | {{{ |
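| 12 | def output(format = :fasta)  # needs require 'erb' at the top of the file |
| 13 |   template_path = File.join(File.dirname(__FILE__), 'sequence', 'formats', "#{format}.erb")  # relative to this file (assumed to be lib/bio/sequence.rb), not to the working directory |
| 14 |   ERB.new(File.read(template_path)).result(binding) |
| 15 | end |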
| 16 | }}} |
| 17 | Since every format, including the default ''fasta'', is now rendered from a template, we will also have to write a template for the FASTA format. |
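To give an idea of what such a FASTA template might contain, here is a minimal sketch. Everything in it is an assumption for illustration: the ''SketchRecord'' class stands in for Bio::Sequence, the attribute names ''entry_id'', ''definition'' and ''seq'' are placeholders, and the template is kept inline as a string instead of being read from a ''fasta.erb'' file so that the example runs on its own. The 60-column line width is likewise just a common choice, not something we have agreed on.

```ruby
require 'erb'

# Hypothetical stand-in for Bio::Sequence with just enough state
# to fill in a FASTA record.
class SketchRecord
  attr_reader :entry_id, :definition, :seq

  # Assumed content of a fasta.erb template; the real file may differ.
  # The sequence is wrapped at 60 characters per line.
  FASTA_TEMPLATE = <<~'TEMPLATE'
    ><%= entry_id %> <%= definition %>
    <%= seq.scan(/.{1,60}/).join("\n") %>
  TEMPLATE

  def initialize(entry_id, definition, seq)
    @entry_id = entry_id
    @definition = definition
    @seq = seq
  end

  # Passing binding lets the template call this object's methods directly.
  def output
    ERB.new(FASTA_TEMPLATE).result(binding)
  end
end

puts SketchRecord.new('X1', 'example record', 'ACGT' * 20).output
```

The template sees the record through ''binding'', so adding a new output format would only mean adding another template that uses the same accessors.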