Version 3 (modified by heikki, 17 years ago) |
---|
BioPerl Round Trip , First Pass
Initial evaluation of sequence format roundtrips: flat file -> BioPerl object -> same flat file format
Based on bioperl-live SVN revision 14501.
BioPerl does not have parsers for these formats:
- ASN.1
- genbank XML
- INSD XML
fasta
- minor: the length of the sequence line can vary (settable within Bio::SeqIO::fasta object)
embl
- MAJOR: sequence name and accession lost in conversion
- MAJOR: OX line for TaxId is lost
- minor: the feature key/value pairs (FT) are not returned in order
- minor: SQ line does not contain CRC32 value
genbank
- MAJOR: SOURCE line adds full stop to the end of the line (following old genbank conversion?)
- minor: line BASE not present in recent genbank file, still generated by bioperl
- minor: features are not returned in order
swiss-prot
- minor: No full stop at the end of the DT lines
- MAJOR: GN line returning only value from key/value pairs (e.g.
GN Name=DOF3.7; Synonyms=BBFA, DAG1;... -> GN DOF3.7 OR BBFA OR DAG1 ...
- minor: OC line word wrapping differences
- minor: extra spaces at the end of the first RT line when there are more than one of them
- MAJOR: RX line:DOI key/value pair lost
- MAJOR: PE (evidence) line returned between CC and DR lines when it should be between DR and KW lines
- minor: extra space after first FT line
- minor: FTid sometimes not written on its own line
- minor: extra space written to the end of the sequence line