Saturday, May 9, 2009

16) LAPIS: FASTA formatting a list of sequences

Let's say you have a list of sequences (one sequence per line) downloaded from a database and you would like to convert the list to FASTA format. To do this quickly, follow the steps below.

1. Copy and paste the list into an Excel sheet
2. Insert a new column before the sequence list column
3. In the new column, enter the number 1 in the first cell and the number 2 in the second cell. To auto-fill the numbers in the rest of the cells, highlight both cells and click-drag the bottom right corner of the second cell down to the last row of the list
4. Copy the two columns to LAPIS as a new file
5. Activate the simultaneous editing mode
6. On the first line, click between the description portion and the sequence portion of the line. E.g. place the cursor right after the number "1", which is just before the sequence
7. Press the Enter key. This moves the sequence portion onto a new line below the description portion. Thanks to the simultaneous editing mode, the same is done automatically for the rest of the sequences
8. There might be extra leading spaces before the sequence. To remove them, click at the beginning of any one of the lines containing a sequence. Then press the Delete key until all the leading spaces are gone. Thanks to the simultaneous editing mode, the leading spaces in all the other sequences will be removed as well
9. Save the file as a text file. You are done
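
If you would rather script this than do it by hand, the steps above can be sketched in a few lines of Python. This is only a minimal sketch, not part of the LAPIS workflow: the function name to_fasta and the prefix parameter are made up for illustration, the input is assumed to be one sequence per line, and the sketch puts a ">" in front of each number because FASTA description lines conventionally start with that character.

```python
def to_fasta(sequences, prefix=""):
    """Turn a list of sequences (one per item) into FASTA-style text.

    Hypothetical helper for illustration. Each non-empty sequence gets a
    description line made of '>', an optional prefix, and a running number,
    followed by the sequence itself on the next line.
    """
    records = []
    number = 0
    for line in sequences:
        seq = line.strip()  # drop leading/trailing spaces, like step 8
        if not seq:
            continue  # skip blank lines in the pasted list
        number += 1  # running number, like the auto-filled column in step 3
        records.append(f">{prefix}{number}\n{seq}\n")
    return "".join(records)


if __name__ == "__main__":
    # Example input: two made-up sequences, the second with leading spaces.
    example = ["MKTAYIAKQR", "  GAVLIPFMW"]
    print(to_fasta(example), end="")
```

To use it on a real list, read the lines of your text file into a Python list, pass that list to to_fasta, and write the returned string to a new text file.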


Posted by: Asif Khan
