I want to build a difference-in-differences model to analyse the effect of the reduction in the drink-drive limit in Scotland on the number of deaths, using England as a control.
I am using the STATS19 data, but I'm finding it really difficult to arrange the data for the model as there is quite a lot of it.
Welcome! I'm afraid you'll need to supply some more info for helpers to be able to understand your problem. This is pretty common: when you're new to this stuff, it's hard to know how much information is enough!
The best thing would be if you can make your question into a reproducible example (follow the link for instructions and explanations). To include your data, you'll want to follow one of the methods discussed here.
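To give a rough idea, here is a minimal sketch of what a reproducible example for this question could look like. Everything in it is an assumption on my part: the numbers are made up (they are not STATS19 figures), the column names `country`, `year`, `deaths`, `treated` and `post` are hypothetical, and I've treated 2015 onwards as the period after the limit change, so adjust all of that to match your real data.

```r
# Made-up yearly death counts for illustration only; NOT real STATS19 figures
toy <- data.frame(
  country = rep(c("Scotland", "England"), each = 4),
  year    = rep(2013:2016, times = 2),
  deaths  = c(20L, 19L, 15L, 14L, 30L, 29L, 28L, 27L)
)

# Indicator for the treated group (Scotland) vs the control (England)
toy$treated <- as.integer(toy$country == "Scotland")

# Indicator for the period after the change (assumed here: 2015 onwards)
toy$post <- as.integer(toy$year >= 2015)

# treated * post expands to treated + post + treated:post;
# the interaction term is the difference-in-differences estimate
fit <- lm(deaths ~ treated * post, data = toy)
did <- coef(fit)[["treated:post"]]
did
```

With only two groups and two periods, the interaction coefficient is just the change in Scotland minus the change in England. A real analysis would need more than this, but a small example along these lines lets helpers see the shape of your data and what you are trying to estimate.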
If you try all that and get stuck, here's a fallback option...
Edit your post and add in some of the code you have tried. It's OK if it doesn't work! It's really helpful to see what you've been attempting. Be sure to format your code as code (it's hard to read unformatted code, and it can get garbled by the forum software).
Include sample data:
If your data set is OK to share, run the following line and paste the output into your post. Again, be sure to format it as code.
dput(head(your_dataframe_name, 10))
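In case it helps to see why `dput()` is so useful: its output is R code that rebuilds the object, so a helper can paste it straight into their own session. Here's a small sketch with a made-up data frame (the names `toy`, `shared` and `rebuilt` and the columns are just for illustration, not from your data).

```r
# A tiny made-up data frame standing in for your real one
toy <- data.frame(
  id      = 1:3,
  country = c("Scotland", "England", "Scotland"),
  stringsAsFactors = FALSE
)

# The rows you would share (head() returns at most the first 10)
shared <- head(toy, 10)

# dput() prints R code describing the object; capture that code as text
txt <- capture.output(dput(shared))

# A helper can evaluate the pasted text to rebuild the same data frame
rebuilt <- eval(parse(text = txt))
isTRUE(all.equal(shared, rebuilt))
```

Because the pasted text carries the column types along with the values, helpers get the data the way R actually sees it, which is often where the problem is hiding.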
If your data set can't be shared, run this line instead and paste the output into your post (and yes, format as code!). This will still share some information about your data. If it's truly confidential, I'm afraid you'll need to make a fake sample dataset to share.