= !BioJava =
Working with !BioJava-live SVN build 4716

== Fasta Format ==
=== Major Issues ===
None.
=== Minor Issues ===
Sequence case is not preserved. Line length varies (default is 80 characters per line).
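
Since case and line wrapping are the only differences seen here, a round-trip check can ignore both. The following is a minimal sketch in plain Java, not part of !BioJava; the class and method names (`FastaNormalizer`, `normalize`, `sameContent`) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not a BioJava class): compares two FASTA texts
// while ignoring sequence case and line wrapping.
final class FastaNormalizer {

    // Returns header lines unchanged, each followed by its sequence
    // joined into a single upper-case line.
    static List<String> normalize(String fasta) {
        List<String> out = new ArrayList<>();
        StringBuilder seq = new StringBuilder();
        for (String line : fasta.split("\\R")) {
            String t = line.trim();
            if (t.isEmpty()) {
                continue; // skip blank lines
            }
            if (t.startsWith(">")) {
                // A new record starts: flush the previous sequence, if any.
                if (seq.length() > 0) {
                    out.add(seq.toString());
                    seq.setLength(0);
                }
                out.add(t);
            } else {
                seq.append(t.toUpperCase(Locale.ROOT));
            }
        }
        if (seq.length() > 0) {
            out.add(seq.toString());
        }
        return out;
    }

    // True when both texts hold the same headers and residues.
    static boolean sameContent(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

With this, a lower-case file wrapped at 60 characters and the same file rewritten in upper case at 80 characters per line compare as equal, while any change to a header or residue is still reported.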

== Genbank format ==
=== Major Issues ===
None.
=== Minor Issues ===
Feature qualifier order is not preserved.

Because the NCBI Taxonomy is looked up from memory or from a database, minor differences appear when the version used doesn't match the version that was used to construct the record. For example, the common name of Arabidopsis changed from thale cress to mouse ear cress.

== GenbankXML format ==
=== Major Issues ===
Not supported (INSD is).

== INSD Format ==
=== Major Issues ===

=== Minor Issues ===
 * !BioJava doesn't add the XML header or the enclosing INSDSet tags; its output starts at INSDSeq. The expected opening is:
{{{
<?xml version="1.0"?>
<!DOCTYPE INSDSet PUBLIC "-//NCBI//INSD INSDSeq/EN" "http://www.ncbi.nlm.nih.gov/dtd/INSD_INSDSeq.dtd">
<INSDSet>
<INSDSeq>
...
}}}
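
If a downstream tool needs the full document, the missing wrapper can be added after the fact. This is a rough sketch in plain Java rather than anything !BioJava provides: `InsdSetWrapper` and `wrap` are hypothetical names, and the closing `</INSDSet>` tag is my assumption, since the listing above only shows the opening of the file.

```java
// Hypothetical helper (not a BioJava class): wraps an INSDSeq fragment
// in the XML declaration, DOCTYPE and INSDSet element shown above.
final class InsdSetWrapper {

    static final String HEADER =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE INSDSet PUBLIC \"-//NCBI//INSD INSDSeq/EN\" "
        + "\"http://www.ncbi.nlm.nih.gov/dtd/INSD_INSDSeq.dtd\">\n";

    // Expects text that starts at <INSDSeq>; one or more records are fine.
    static String wrap(String insdSeqXml) {
        String body = insdSeqXml.trim();
        if (!body.startsWith("<INSDSeq>")) {
            throw new IllegalArgumentException("expected output starting with <INSDSeq>");
        }
        // Assumption: the set is closed with </INSDSet> to keep the XML well-formed.
        return HEADER + "<INSDSet>\n" + body + "\n</INSDSet>\n";
    }
}
```

The helper refuses input that already carries a header, so it cannot wrap the same text twice by accident.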

 * !BioJava includes the Reference_position tag. NCBI omits it unless the position is something other than 1..1.
{{{
<INSDReference_position>1..1</INSDReference_position>
}}}

 * There are other examples of this redundancy. I think that as long as it doesn't break the DTD, it doesn't matter.

 * Qualifier order is not preserved. I don't think this matters.
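
When comparing !BioJava output against the original record, qualifier order can be ignored by sorting both sides first. Below is a small sketch in plain Java, assuming each qualifier has already been flattened to a `name=value` string; `QualifierCompare` and `sameIgnoringOrder` are hypothetical names, not !BioJava API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not a BioJava class): order-insensitive comparison
// of qualifiers given as "name=value" strings.
final class QualifierCompare {

    // Sorts copies of both lists, so the inputs are left untouched and
    // repeated qualifiers are still counted.
    static boolean sameIgnoringOrder(List<String> a, List<String> b) {
        List<String> x = new ArrayList<>(a);
        List<String> y = new ArrayList<>(b);
        Collections.sort(x);
        Collections.sort(y);
        return x.equals(y);
    }
}
```

Sorting rather than using a set keeps duplicates significant, so a feature that loses one of two identical qualifiers is still flagged.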