Python_05_answers
1.
Write a script to do the following to Python_05.txt
Open and read the contents.
Uppercase each line
Print each line to the STDOUT
#!/usr/bin/env python3
#Python_05 question 1
# Read the file line by line, uppercase each line, and print it to STDOUT
with open("../python_05.txt", "r") as file:
    for line in file:
        # strip the newline so print() does not add a blank line
        print(line.rstrip("\n").upper())
2.
Modify the script in the previous problem to write the contents to a new file called "Python_05_uc.txt"
#!/usr/bin/env python3
#Python_05 question 2
# Uppercase each line, print it, and write it to Python_05_uc.txt
with open("../python_05.txt", "r") as file, open("../Python_05_uc.txt", "w") as file_write:
    for line in file:
        line_upper = line.upper()
        print(line_upper, end="")
        file_write.write(line_upper)
3. Open and print the reverse complement of each sequence in Python_05.fasta.
Make sure to print the output in fasta format including the sequence name and a note in the description that this is the reverse complement.
Print to STDOUT and capture the output into a file with a command line redirect '>'.
wget --no-check-certificate https://raw.githubusercontent.com/srobb1/pfb2017/master/files/Python_05.fasta
(check text file for full notes)
#!/usr/bin/env python3
#Python_05 question 3
fasta_read = open ("../Python_05.fasta", "r")
fasta_write = open ("../Python_05comb.fasta", "w")
combined = ""
for dna in fasta_read:
    dna = dna.rstrip()
    if dna.startswith('>'):
        if combined:
            fasta_write.write(combined + "\n")
            combined = ""
        fasta_write.write(dna + "\n")
    else:
        combined += dna
fasta_write.write(combined + "\n")
fasta_read.close()
fasta_write.close()
#dna = dna.rstrip()
#dna_reverse= dna.replace('T', 'a').replace('A', 't').replace('G', 'c').replace('C', 'g')
#dna_reverse = dna_reverse.rstrip()
#fasta_write.write(dna_reverse[::-1])
#print ("../Python_05_rc.fasta")
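The script above joins each multi-line sequence onto one line but stops short of the reverse complement. A minimal sketch of that remaining step follows. The function names (reverse_complement, read_fasta, format_rc_records) and the wording of the note appended to the header are assumptions for illustration, not part of the original answer. It reads lines from any iterable, so it can be tried on a small in-memory example before pointing it at ../Python_05.fasta.

```python
#!/usr/bin/env python3
# Sketch only: helper names and the header note text are hypothetical.

# Complement table for upper- and lowercase bases; any other
# character (for example N) is passed through unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs, joining multi-line sequences."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def format_rc_records(lines):
    """Return FASTA-formatted output lines for the reverse complements."""
    out = []
    for header, seq in read_fasta(lines):
        out.append(">" + header + " reverse complement")
        out.append(reverse_complement(seq))
    return out


# Small in-memory example; for the real file, pass an open file handle,
# e.g. format_rc_records(open("../Python_05.fasta")).
example = [">seq1 test\n", "ATGC\n", "GGTA\n"]
for out_line in format_rc_records(example):
    print(out_line)
```

When run on the real file, printing to STDOUT lets the output be captured with a redirect, for example: python3 myscript.py > ../Python_05_rc.fasta (the script name here is a placeholder).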
4.
Open the FASTQ file Python_05.fastq and go through each line of the file. Count the number of lines and the number of characters per line. Have your program report the:
total number of lines
total number of characters
average line length
#!/usr/bin/env python3
#Python_05 question 4
linecount = 0
chrcount = 0
fastq = open ("../Python_05.fastq", "r")
for line in fastq:
    linecount+= 1
    chrcount += len(line)
print ("linecount:", linecount)
print ("chrcount:", chrcount )
print ("avglinelength:", (chrcount/linecount) )
pfb14:files admin$ python3 myscript_19.py
linecount: 120
chrcount: 7920
avglinelength: 66.0
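***** FORGOT that \n is a character: the counts above include one newline per line (7920 - 7800 = 120). Version that strips the newline before counting: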
#!/usr/bin/env python3
#Python_05 question 4
linecount = 0
chrcount = 0
fastq = open ("../Python_05.fastq", "r")
for line in fastq:
    line = line.rstrip("\n")
    linecount+= 1
    chrcount += len(line)
print ("linecount:", linecount)
print ("chrcount:", chrcount )
print ("avglinelength:", (chrcount/linecount) )
linecount: 120
chrcount: 7800
avglinelength: 65.0
5. Genelists from Ensembl Biomart
pfb14:files admin$ python3 myscript_20.py
not_sc_count: 15147
ENSVPAG00000000048
ENSVPAG00000006160
ENSVPAG00000003148
ENSVPAG00000006812
both_scandpigm_count: 5