I want to color each character conditionally in every cell of a column.
I can only partly imagine what to do (I am new to R):
1. open an xlsx table, or open a txt file and convert it to xlsx
2. iterate through the column (treat each cell as a vector)
3. iterate through each vector (its characters) and change the color conditionally
(using a regex to find the lines that should be colored, i.e. the sequences)
4. save to xlsx
But I do not know how to color individual characters in an xlsx cell (or which library to use), or how to save the file with this change.
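For steps 2 and 3, here is a minimal sketch in base R of how the per-character coloring could be prepared. The color rule (K red, D and E blue, everything else black) and the function names `color_for` and `color_runs` are placeholders made up for illustration, not my real rule. It only computes which stretches of a string get which color; writing those runs into a cell would still need a rich-text-capable xlsx API, which is the part I am missing.

```r
# Hypothetical color rule: map single characters to color names.
# The letters and colors are placeholders for the real condition.
color_for <- function(ch) {
  palette <- c(K = "red", D = "blue", E = "blue")
  col <- unname(palette[ch])
  col[is.na(col)] <- "black"  # characters without a rule keep the default
  col
}

# Split a string into characters and merge neighbours that share a color
# into runs: one row per run, with 1-based start/end positions.
color_runs <- function(text) {
  chars <- strsplit(text, "", fixed = TRUE)[[1]]
  runs <- rle(color_for(chars))
  end <- cumsum(runs$lengths)
  start <- end - runs$lengths + 1
  data.frame(
    start = start,
    end = end,
    color = runs$values,
    text = substring(text, start, end),
    stringsAsFactors = FALSE
  )
}

# First nine residues of the sample sequence
runs <- color_runs("LAKELTYTD")
print(runs)
```

Grouping characters into runs means a font only has to be applied once per stretch of equal color instead of once per character.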
Sample data:
>>f_2;hypothetical protein L_2128 [Legionella] {gene:L_2128}_start=1;end=300;length=300;source_length=320
LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCTLINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKASKARIHVADIEKIVGDKISYKNKNVISG
&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 | -0.368E+01 >>_vfdb.0002001_ VFG001328(gi:21283614) (sak) Staphylokinase precursor [Staphylokinase (VF0021)] [Staphylococcus aureus subsp. aureus MW2] :_: Length: 163
ss1 HHHHHHHHHEEEETTCCCCCCCHHHEEEHTTTTTTTH-HHHHHHEEEEHHHHHHTHEEEEEECTCCCCHHHEEECCCCCCTTEHHHHHHHHHHH
#1 LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCT-LINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKAS
#c ----------------------+---------------+----+-----------+------+-+--+-----------+--------------
#2 VEFPIKPGTTLTKEK--IEYYVEWALDATAYKEFRVVELDTSAKIEVTYYDKNKKKEETKSFPITEKGFVVPDLSEHIKNPGFNLITKVVIEKK
ss2 EEETTCCTCCCHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHHHEEHHHHHHHHHHHHHHCHHHTTTEECHHHHHTTCCTTTCEEEHHHHHHH
pseudoscore: 8.51
1st sequence starts at 1
2nd sequence starts at 72
>>f_1; hypothetical protein L_2128 [Legionella] {gene:L_2128}_start=201;end=320;length=120;source_length=320
LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCTLINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKASKARIHVADIEKIVGDKISYKNKNVISG
&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 | -0.368E+01 >>_vfdb.0002001_ VFG001328(gi:21283614) (sak) Staphylokinase precursor [Staphylokinase (VF0021)] [Staphylococcus aureus subsp. aureus MW2] :_: Length: 163
ss1 HHHHHHHHHEEEETTCCCCCCCHHHEEEHTTTTTTTH-HHHHHHEEEEHHHHHHTHEEEEEECTCCCCHHHEEECCCCCCTTEHHHHHHHHHHH
#1 LAKELTYTDIINLKDSGLISNSEALCSIDFSERNSCT-LINCKKLIIIEASQESSKIQLSILPFTKAGTELLAFTNPTSNNEYIMKLCNLVKAS
#c ----------------------+---------------+----+-----------+------+-+--+-----------+--------------
#2 VEFPIKPGTTLTKEK--IEYYVEWALDATAYKEFRVVELDTSAKIEVTYYDKNKKKEETKSFPITEKGFVVPDLSEHIKNPGFNLITKVVIEKK
ss2 EEETTCCTCCCHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHHHEEHHHHHHHHHHHHHHCHHHTTTEECHHHHHTTCCTTTCEEEHHHHHHH
pseudoscore: 8.51
1st sequence starts at 1
2nd sequence starts at 72
My code:
# xlsx files
setwd('D:/Dropbox/color_ffas_results')
library(xlsx)
wb <- loadWorkbook("sample.xlsx")
sheet1 <- getSheets(wb)[[1]]
# get all rows
rows <- getRows(sheet1)
cells <- getCells(rows)
# look at the values
sapply(cells, getCellValue)
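# Placeholder: this is where the color should be set
cellColor <- function(style) {
  # SET COLOR HERE
}
# Pattern for the sequence lines to color (e.g. "#1 LAKEL...");
# backslashes must be doubled inside an R string
#sequence_pattern <- "^#\\d .*"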
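The commented-out pattern is meant to pick out the sequence lines ("#1 ...", "#2 ...") and skip the rest. A minimal sketch of that selection in base R, assuming the cell values have already been read into a character vector; the vector `lines` below is a shortened copy of a few sample lines, and the names `is_sequence` and `sequences` are only illustrative.

```r
# Shortened copies of sample lines (first 17 characters of each payload)
lines <- c(
  "ss1 HHHHHHHHHEEEETTCC",
  "#1 LAKELTYTDIINLKDSG",
  "#c -----------------",
  "#2 VEFPIKPGTTLTKEK--",
  "pseudoscore: 8.51"
)

# TRUE for lines that start with "#", one or more digits, then a space
is_sequence <- grepl("^#[0-9]+ ", lines)

# Strip the "#<n> " prefix so only the residues remain
sequences <- sub("^#[0-9]+ +", "", lines[is_sequence])

print(is_sequence)
print(sequences)
```

Using `[0-9]` instead of `\d` avoids the double-backslash escaping in the R string, and the "#c" line is skipped because "c" is not a digit.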