Plotting a long string of (x,y) values as a chromosome

Hello everyone! I want to plot a large CSV file with roughly 100k rows and 2 columns. The first column holds coordinates on a chromosome (in ascending order), and the second column is either -1 or 1 (from grandpa or from grandma). Below is an example of a few rows.
14932 -1
14937 -1
15015 -1
16257 1
16487 1
20191 -1
Do y'all have any recommendations on how to plot this? I tried to do this as a scatterplot, with the coordinates (first column) as x, and the second column as y, but it does not turn out well as there are >100k rows.
It would be great if I could represent the chromosome as a "line" but with the portions from grandpa (1) and grandma (-1) marked out. Thank you so much!

What information are you trying to convey with the plot? Are you trying to present a summary or show the actual raw data in some meaningful way?

It would be great if I could represent the chromosome as a "line" but with the portions from grandpa (1) and grandma (-1) marked out.

Can you provide some more details about what you think this might look like? I immediately thought about chunking the genome into sliding windows and computing the proportion of each window being from grandma or grandpa and plotting that. Not sure if that is what you mean.
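To make the windowing idea concrete, here is a minimal base R sketch. It uses the simplest variant, fixed non-overlapping windows along the coordinate axis rather than truly sliding ones, and computes the proportion of markers coded 1 in each window. The column names `pos` and `origin` and the `window_size` value are assumptions for illustration, and the data are just the six rows from the question.

```r
# Toy data in the same shape as the question: position, then origin (-1 or 1).
df <- data.frame(
  pos    = c(14932, 14937, 15015, 16257, 16487, 20191),
  origin = c(-1, -1, -1, 1, 1, -1)
)

# Assumed window width in coordinate units; tune this for the real data.
window_size <- 2000

# Assign each marker to a fixed-width window along the chromosome.
df$window_start <- floor(df$pos / window_size) * window_size

# Proportion of markers in each window whose origin is coded 1.
prop <- aggregate(origin ~ window_start, data = df,
                  FUN = function(x) mean(x == 1))
names(prop)[2] <- "prop_one"

# Step plot: flat stretches near 0 or 1 are single-origin blocks,
# and jumps between them suggest candidate breakpoints.
plot(prop$window_start, prop$prop_one, type = "s",
     xlab = "Chromosome coordinate", ylab = "Proportion coded 1",
     ylim = c(0, 1))
```

Windows that contain no markers simply do not appear in `prop`, so gaps in marker coverage show up as long horizontal steps in the plot.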

Thank you so much for your response! I am trying to illustrate the "chunks" of grandpa and grandma portions, so as to show the recombination breakpoints. I think your sliding window idea might work! May I ask if you know of any functions or packages for that?

I use rsample::sliding_window() or rsample::sliding_index() for these purposes.
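A rough sketch of how `rsample::sliding_window()` could be applied here, assuming the rsample package is installed. The data are simulated stand-ins for the real file, the column names `pos` and `origin` are made up for illustration, and the `lookback` and `step` values are arbitrary choices to adjust. Note that this slides over rows (markers), not over coordinate distance.

```r
library(rsample)

set.seed(1)
# Simulated stand-in for the real file: sorted positions and a -1/1 origin
# column arranged in four blocks of 500 markers each.
df <- data.frame(
  pos    = sort(sample.int(1e6, 2000)),
  origin = rep(c(-1, 1, -1, 1), each = 500)
)

# Each split's analysis set holds the current row plus the 99 rows before it.
# step = 50 keeps every 50th window, so consecutive windows overlap by half.
windows <- sliding_window(df, lookback = 99, step = 50)

# Summarise each window: its median position and the proportion coded 1.
summary_df <- do.call(rbind, lapply(windows$splits, function(s) {
  a <- analysis(s)
  data.frame(mid_pos = median(a$pos), prop_one = mean(a$origin == 1))
}))

# Line plot of the windowed proportion along the chromosome.
plot(summary_df$mid_pos, summary_df$prop_one, type = "l",
     xlab = "Chromosome coordinate", ylab = "Proportion coded 1",
     ylim = c(0, 1))
```

`rsample::sliding_index()` takes an index column instead, which may suit windows defined by coordinate distance rather than by marker count.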
