Hi. What I am trying to achieve
I have 260 demand points (postal codes/GPS coordinates), each with an associated daily demand in kg. I have trucks with a capacity of 12,000 kg each. I want to cluster the data so that one truck can satisfy the demand of each cluster (to achieve high capacity utilization) and so that the distance between the demand points within a cluster is as small as possible. In short, I want points near each other grouped together, subject to a limit of 12,000 kg per cluster (see the example data below).
I have tried searching all over the internet, but I am not finding anything specific on how to achieve this. I previously made a successful cluster analysis based only on the coordinates, but I really need some advice on how to incorporate this volume constraint. Thank you in advance!
Perhaps you can share that approach here. If you are hoping to augment it rather than throw it away and start from scratch, that would save anyone who tries to help you from reinventing the wheel and doing more work than strictly necessary.
I would think that the GPS coordinates are critical data, as the postal codes alone won't serve the task, so I assume the example data you provided is insufficient for me to start working from scratch even if I wished to. Could you review that as well?
As far as I understand from my research online, I won't be able to reuse what I already made, since I used k-means clustering and that doesn't allow for constraints. Also, unfortunately, because I am new to RStudio I have not been able to figure out how to save the code I'm writing, so I only have the results.
I am not looking for someone to do the task for me, since that would be too much to ask. I am only looking for advice on which direction to go in, or example code that I can use.
Here is an example with GPS coordinates, and daily volume. The volume numbers are randomly generated, since I'm not allowed to share that detailed information with GPS coordinates and demand.
How about doing your traditional position-only cluster analysis?
Then sum up the weight of each final cluster.
For any final cluster that is over the max weight, you then look for a way to split it.
It would then be a question of tuning the hyperparameters of your initial clustering and of your second split phase to find a good balance. I'd maybe just use optim() for that, as a meta-process wrapped around the two phases.
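To make the two-phase idea concrete, here is a minimal base R sketch. It is only an illustration under stated assumptions: the data is synthetic, and the names pts, capacity and split_overweight are made up for this example. Phase 1 is an ordinary position-only k-means; phase 2 keeps splitting the heaviest over-capacity cluster in two (again with k-means on position) until every cluster fits in one truck.

```r
# Sketch only: synthetic data and hypothetical names (pts, capacity, split_overweight).
set.seed(1)
n <- 260
pts <- data.frame(
  lon = runif(n, 8, 12),      # made-up coordinates
  lat = runif(n, 55, 57),
  kg  = runif(n, 500, 4000)   # made-up daily demand per point
)
capacity <- 12000

# Phase 1: position-only k-means, ignoring demand.
km <- kmeans(pts[, c("lon", "lat")], centers = 4, nstart = 10)
pts$cluster <- km$cluster

# Phase 2: repeatedly split the heaviest over-capacity cluster in two.
split_overweight <- function(pts, capacity) {
  repeat {
    load <- tapply(pts$kg, pts$cluster, sum)       # total kg per cluster
    over <- names(load)[load > capacity]
    if (length(over) == 0) break                   # every cluster fits a truck
    id  <- as.integer(over[which.max(load[over])]) # heaviest offender
    idx <- which(pts$cluster == id)
    if (length(idx) < 2) stop("a single point exceeds the truck capacity")
    half <- if (length(idx) > 2) {
      kmeans(pts[idx, c("lon", "lat")], centers = 2, nstart = 5)$cluster
    } else {
      seq_along(idx)                               # two points: one each
    }
    # Points in the second half get a brand-new cluster id.
    pts$cluster[idx[half == 2]] <- max(pts$cluster) + 1L
  }
  pts
}

res <- split_overweight(pts, capacity)
loads <- tapply(res$kg, res$cluster, sum)
summary(as.numeric(loads))   # how full the resulting trucks are
```

This guarantees that no cluster exceeds the capacity, but it does nothing to fill trucks well: a split can leave two half-empty clusters. That is where the tuning step would come in, for example by varying the number of initial centers and comparing the number of clusters and their loads.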
I think that can be a solution if I don't find a more structured approach. However, my previous k-means clustering analysis, which was based only on location, resulted in only four clusters.
The total daily volume is around 600,000 kg, which corresponds to about 50 clusters if each cluster has a volume constraint of 12,000 kg. So it will probably be really complex to divide them manually, since the volume also varies from demand point to demand point. I'm really unfamiliar with R and pressed for time, so this process would be me exporting the clusters to Excel and trying to divide them in there, which is probably not what you were suggesting.
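The arithmetic behind the "about 50 clusters" estimate can be written out in a few lines of R. This is just a sketch using the rounded figures from this post; the variable names are made up for illustration.

```r
# Rounded figures from the post; variable names are illustrative only.
total_kg <- 600000   # approximate total daily volume
capacity <- 12000    # kg per truck
n_points <- 260      # demand points

# Lower bound on the number of clusters: no split of the points can use fewer trucks.
min_clusters <- ceiling(total_kg / capacity)
min_clusters

# Average number of demand points per cluster at that lower bound.
n_points / min_clusters
```

Note that this is only a lower bound. Because each demand point has to go into one cluster as a whole, the real number of clusters will usually come out somewhat higher than total volume divided by capacity.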
Just wanted to let you know that I've now had the chance to look through what you sent, and you really saved the day. This was exactly what I was looking for, with both raw data and code examples. Thank you so much!