
Sunday 24 July 2016

RNA Arrays

Get RNA Arrays annotation from Bioconductor

I want to annotate each probe in a given array, for example with its gene symbol. The example below retrieves the RefSeq accessions for each probe.

I will use the MOE430A and MOE430B chip annotations.

You can install them with the following code, after uncommenting it. On Bioconductor 3.8 and later, biocLite() has been replaced by BiocManager:

#install.packages("BiocManager")
#BiocManager::install(c("moe430a.db", "moe430b.db"))

Now we load the libraries (dplyr is only needed for bind_rows) and get the data:

library(moe430a.db)
library(moe430b.db)
library(dplyr)

Following the package documentation, we take the probes that have a RefSeq mapping (mappedkeys) and convert each map to a data frame:

mapped_probes_a.keys <- mappedkeys(moe430aREFSEQ)
mapped_probes_a.df   <- as.data.frame(moe430aREFSEQ[mapped_probes_a.keys])
mapped_probes_b.keys <- mappedkeys(moe430bREFSEQ)
mapped_probes_b.df   <- as.data.frame(moe430bREFSEQ[mapped_probes_b.keys])
mapped_probes.df     <- unique(bind_rows(mapped_probes_a.df, mapped_probes_b.df))
rm(mapped_probes_a.keys, mapped_probes_b.keys, mapped_probes_a.df, mapped_probes_b.df)
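
The two chips go through exactly the same steps, so the code above could also be written with a small helper. This is only a sketch: bimap_to_df and mapped_probes_alt.df are hypothetical names that are not part of the packages, and base rbind is used here in place of bind_rows.

```r
library(moe430a.db)
library(moe430b.db)

# Hypothetical helper (not part of the packages): keep only the probes that
# have a mapping and convert the map to a data frame.
bimap_to_df <- function(bimap) {
  as.data.frame(bimap[mappedkeys(bimap)])
}

# Apply the helper to both chips, stack the results and drop duplicate rows.
refseq_maps <- list(moe430aREFSEQ, moe430bREFSEQ)
mapped_probes_alt.df <- unique(do.call(rbind, lapply(refseq_maps, bimap_to_df)))
```

With the helper in place there are no intermediate objects to clean up with rm afterwards.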

The data is now in the mapped_probes.df data frame. Its first rows look like this:

##      probe_id    accession
## 1  1415670_at    NM_017477
## 2  1415670_at    NM_201244
## 3  1415670_at    NP_059505
## 4  1415670_at    NP_957696
## 5  1415670_at XM_006506386
## 6  1415670_at XP_006506449
## 7  1415671_at    NM_013477
## 8  1415671_at    NP_038505
## 9  1415672_at NM_001042484
## 10 1415672_at    NM_020585
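
If gene symbols are wanted instead of RefSeq accessions, the same pattern should work with the SYMBOL maps that the two packages also provide. A minimal sketch, assuming both packages are installed; the names symbols_a.df, symbols_b.df and symbols.df are my own.

```r
library(moe430a.db)
library(moe430b.db)

# Same steps as for REFSEQ, but on the SYMBOL maps: keep the mapped probes
# and convert each map to a data frame.
symbols_a.df <- as.data.frame(moe430aSYMBOL[mappedkeys(moe430aSYMBOL)])
symbols_b.df <- as.data.frame(moe430bSYMBOL[mappedkeys(moe430bSYMBOL)])

# Stack both chips and drop duplicate rows.
symbols.df <- unique(rbind(symbols_a.df, symbols_b.df))
head(symbols.df)
```

The resulting data frame has a probe_id column and a symbol column.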

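We may want to save the data frame to a .csv file. Passing row.names = FALSE keeps the row numbers out of the file, so that no extra column appears when it is read back:

write.csv(mapped_probes.df, "mapped_probes_df.csv", row.names = FALSE)

We can then open it again later: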

again_mapped_probes.df <- read.csv("mapped_probes_df.csv")
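
One detail to watch in the round trip: write.csv writes the row names by default, and read.csv then brings them back as an extra first column. The sketch below shows this on three rows taken from the output above, using a temporary file; the variable names are only for illustration.

```r
# Three rows copied from the output shown earlier.
probes <- data.frame(
  probe_id  = c("1415670_at", "1415670_at", "1415671_at"),
  accession = c("NM_017477", "NM_201244", "NM_013477"),
  stringsAsFactors = FALSE
)

# Default behaviour: row names are written and come back as a column named "X".
file_default <- tempfile(fileext = ".csv")
write.csv(probes, file_default)
with_row_names <- read.csv(file_default, stringsAsFactors = FALSE)
names(with_row_names)

# With row.names = FALSE only the two original columns are written.
file_clean <- tempfile(fileext = ".csv")
write.csv(probes, file_clean, row.names = FALSE)
probes_again <- read.csv(file_clean, stringsAsFactors = FALSE)
all.equal(probes, probes_again)
```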

Done!
