BED Files

BED files are used to store annotations or data values in a simple text format.


BED files require at least three columns; an optional fourth column holds a descriptor or gene name.

column1: chromosome name (e.g. chr1)

column2: start position (0-based)

column3: stop position (not included in the feature)

column4: gene name or identifier


Note: columns in a BED file are always tab-delimited
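
A quick way to check this layout is to count the tab-separated fields on each line. The sketch below is only an illustration: toy.bed and toy_column_counts.txt are hypothetical file names, the three records are copied from the cancer_genes.bed listing in this section, and awk is a tool not otherwise covered here.

```shell
# Build a small tab-delimited BED file for illustration (toy.bed is a hypothetical name).
printf 'chr1\t2975604\t3345045\tPRDM16\n'  > toy.bed
printf 'chr1\t17217812\t17253252\tSDHB\n' >> toy.bed
printf 'chr1\t18830087\t18947946\tPAX7\n' >> toy.bed

# Split each line on tabs, print its number of fields (NF), and keep the distinct values.
awk -F'\t' '{ print NF }' toy.bed | sort -u > toy_column_counts.txt

# A consistent file yields a single value here; for toy.bed that value is 4.
cat toy_column_counts.txt
```

If more than one value is printed, some lines have a different number of columns, which often means spaces were used in place of tabs.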


Let’s examine the cancer_genes.bed file with head

head cancer_genes.bed


chr1    2975604     3345045        PRDM16

chr1    17217812    17253252       SDHB

chr1    18830087    18947946       PAX7

Often it will be necessary to extract a subset of columns from a BED file to produce another file


Use the cut command to extract the gene identifier column (column 4) and make a new file with the gene names

cut -f 4 cancer_genes.bed > genes.txt

Examine the first lines of genes.txt to confirm that the 4th column was extracted

more genes.txt
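
The -f option of cut also accepts a comma-separated list of columns. As a minimal sketch, assuming a hypothetical three-line file toy.bed built from the records shown earlier, the following keeps the chromosome and gene name columns together (toy_chrom_gene.txt is likewise a made-up output name):

```shell
# Build a small tab-delimited BED file for illustration (hypothetical file name).
printf 'chr1\t2975604\t3345045\tPRDM16\n'  > toy.bed
printf 'chr1\t17217812\t17253252\tSDHB\n' >> toy.bed
printf 'chr1\t18830087\t18947946\tPAX7\n' >> toy.bed

# -f 1,4 keeps columns 1 and 4; the output stays tab-delimited.
cut -f 1,4 toy.bed > toy_chrom_gene.txt

# Show the two-column result.
cat toy_chrom_gene.txt
```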

The genes are not in alphabetical order, so let’s sort them using the sort command

sort genes.txt

Let’s sort them in reverse alphabetical order

sort -r genes.txt
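
sort can also order a file on specific columns, which is handy for the BED file itself rather than the gene list. The sketch below assumes a hypothetical file toy_shuffled.bed holding the three records shown earlier in a deliberately shuffled order. -k1,1 sorts on the chromosome column and -k2,2n sorts numerically on the start position; without the n, 17217812 would sort before 2975604 because the comparison would be character by character.

```shell
# Three records from this section, written out of order (hypothetical file name).
printf 'chr1\t18830087\t18947946\tPAX7\n'   > toy_shuffled.bed
printf 'chr1\t2975604\t3345045\tPRDM16\n'  >> toy_shuffled.bed
printf 'chr1\t17217812\t17253252\tSDHB\n'  >> toy_shuffled.bed

# Sort by chromosome (column 1), then numerically by start position (column 2).
sort -k1,1 -k2,2n toy_shuffled.bed > toy_sorted.bed

# The records now appear in order of increasing start position.
cat toy_sorted.bed
```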

Now let’s find all the genes that contain the string ‘RAS’

grep RAS genes.txt
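
Keep in mind that grep matches RAS anywhere on a line, not just whole names. The sketch below uses a hypothetical gene list, toy_genes.txt, whose names are chosen only for illustration and are not taken from cancer_genes.bed. It shows how the anchors ^ (start of line) and $ (end of line) narrow the match.

```shell
# A small gene list for illustration (hypothetical file, one name per line).
printf 'KRAS\nNRAS\nHRAS\nRASA1\nPAX7\n' > toy_genes.txt

# Names that start with RAS.
grep '^RAS' toy_genes.txt > toy_starts_with_ras.txt

# Names that end with RAS.
grep 'RAS$' toy_genes.txt > toy_ends_with_ras.txt

cat toy_starts_with_ras.txt   # RASA1
cat toy_ends_with_ras.txt     # KRAS, NRAS, HRAS (one per line)
```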

How many genes contain the string RAS?

Pipe the output of grep to wc -l to count the matching lines

grep RAS genes.txt | wc -l

Now let’s return to the cancer_genes.bed file and use the head and cut commands (no, it’s not a guillotine) to extract the first 10 lines from column 1 and output a file called column1.txt

head -10 cancer_genes.bed | cut -f 1 > column1.txt

Now let’s use head and cut to extract the first 10 lines from column 3 and output a file called column3.txt

head -10 cancer_genes.bed | cut -f 3 > column3.txt

The counterpart of the cut command is the paste command, which can be used to paste columns together


We will use the paste command to stitch together columns 1 and 3 and make a new file called join_columns.txt

paste column1.txt column3.txt > join_columns.txt

Use the more command to examine the contents of the join_columns.txt file

more join_columns.txt
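
To see what paste does on a small input, here is a sketch that assumes a hypothetical three-line file toy.bed built from the records shown earlier (all toy_ file names are made up for this illustration). It also checks that pasting columns 1 and 3 back together gives the same result as selecting both columns in a single cut call.

```shell
# Build a small tab-delimited BED file for illustration (hypothetical file name).
printf 'chr1\t2975604\t3345045\tPRDM16\n'  > toy.bed
printf 'chr1\t17217812\t17253252\tSDHB\n' >> toy.bed
printf 'chr1\t18830087\t18947946\tPAX7\n' >> toy.bed

# Extract columns 1 and 3 into separate files.
cut -f 1 toy.bed > toy_col1.txt
cut -f 3 toy.bed > toy_col3.txt

# paste joins line 1 with line 1, line 2 with line 2, and so on, separated by a tab.
paste toy_col1.txt toy_col3.txt > toy_joined.txt

# Selecting both columns in one cut call produces the same two-column file.
cut -f 1,3 toy.bed > toy_direct.txt

# diff prints nothing and exits with status 0 when the files are identical.
diff toy_joined.txt toy_direct.txt && echo "identical"
```

The two-step route is mainly useful when the columns come from different files or need to be rearranged; cut on its own always emits columns in their original order.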

Alternatively, we can stack the two files vertically, one after the other, using the cat command

cat column1.txt column3.txt > vertical_columns.txt

more vertical_columns.txt
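
The difference between the two commands shows up in the line counts. In this sketch the two input files are hypothetical three-line stand-ins for column1.txt and column3.txt, with values copied from the records shown earlier.

```shell
# Two small single-column files for illustration (hypothetical file names).
printf 'chr1\nchr1\nchr1\n' > toy_col1.txt
printf '3345045\n17253252\n18947946\n' > toy_col3.txt

# paste places the files side by side: the result has as many lines as one input.
paste toy_col1.txt toy_col3.txt > toy_side_by_side.txt
wc -l < toy_side_by_side.txt   # 3

# cat places the second file under the first: the line counts add up.
cat toy_col1.txt toy_col3.txt > toy_stacked.txt
wc -l < toy_stacked.txt        # 6
```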

That concludes our section on working with BED files