Prediction versus classification

Hi. I am interested in creating a model that will successfully predict which customers are interested in buying a particular project, and that will also describe the common characteristics of those customers. Does it make sense to use two different algorithms, such as a random forest for prediction and a decision tree for classification? My confusion is that a random forest does not provide classification rules, but a single decision tree is not optimal.



Are you familiar with this kind of problem? First of all: to build a classification model you need a dataset of examples with the actual label, so the model can learn how to classify future unseen examples. That is, using a dataset where each row describes a customer (an example) and each column an attribute used to define that customer, you need an additional class attribute that contains the value "yes" or "no" (for example), indicating whether the customer represented in that row bought the project. With this input data you can build the model (it could be a decision tree) using the known information. Then, with the classification model you have built, you can predict whether new customers are potential buyers.
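To make that concrete, here is a minimal sketch in plain Python with an entirely hypothetical dataset (the attribute names `age` and `income` and all values are made up). It learns a trivial one-rule model (a decision stump, i.e. a one-level decision tree) from labeled rows and then predicts for an unseen customer. In practice you would use a library such as scikit-learn instead of hand-rolled code:

```python
# Hypothetical labeled dataset: each row is a customer (an example),
# and "bought" is the class attribute ("yes"/"no").
customers = [
    {"age": 25, "income": 30000, "bought": "yes"},
    {"age": 28, "income": 42000, "bought": "yes"},
    {"age": 45, "income": 80000, "bought": "no"},
    {"age": 52, "income": 60000, "bought": "no"},
    {"age": 23, "income": 25000, "bought": "yes"},
    {"age": 40, "income": 55000, "bought": "no"},
]

def learn_stump(rows, attribute):
    """Learn a one-rule model: find the threshold t on `attribute`
    such that the rule `attribute < t -> yes` classifies the most
    training rows correctly."""
    best_t, best_correct = None, -1
    for t in sorted({r[attribute] for r in rows}):
        correct = sum(
            (r[attribute] < t) == (r["bought"] == "yes") for r in rows
        )
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

threshold = learn_stump(customers, "age")

def predict(customer):
    """Apply the learned rule to a new, unseen customer."""
    return "yes" if customer["age"] < threshold else "no"

print(f"learned rule: age < {threshold} -> yes")
print(predict({"age": 22, "income": 28000}))  # unseen customer -> "yes"
```

The same two-phase shape (fit on labeled rows, then predict on new rows) is exactly what a real decision tree or random forest implementation follows.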

Do you have that prior information?


Noelia, thank you for your response. I have the data, and I have built a first attempt at a decision tree and a random forest. Am I correct that the forest does not give me classification rules such as age < 30? So I need the tree for such a rule. But a single tree is not optimal; that's why I built a forest.
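That is essentially right, and a small sketch may show why. A random forest is a majority vote over many trees, each trained on a resampled version of the data; every individual tree still has readable rules, but the ensemble's decision does not reduce to one rule list. Below is a toy illustration in plain Python, using one-level trees (stumps) and hypothetical data, not a faithful random forest implementation:

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical labeled customers (same shape as before).
customers = [
    {"age": 25, "income": 30000, "bought": "yes"},
    {"age": 28, "income": 42000, "bought": "yes"},
    {"age": 45, "income": 80000, "bought": "no"},
    {"age": 52, "income": 60000, "bought": "no"},
    {"age": 23, "income": 25000, "bought": "yes"},
    {"age": 40, "income": 55000, "bought": "no"},
]

def learn_stump(rows, attribute):
    """Best single rule `attribute < t -> yes` on these rows."""
    best_t, best_correct = None, -1
    for t in sorted({r[attribute] for r in rows}):
        correct = sum(
            (r[attribute] < t) == (r["bought"] == "yes") for r in rows
        )
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

# Tiny "forest": each tree sees a bootstrap sample of the data
# and a randomly chosen attribute (the two sources of randomness
# a real random forest combines).
forest = []
for _ in range(5):
    sample = [random.choice(customers) for _ in customers]
    attr = random.choice(["age", "income"])
    forest.append((attr, learn_stump(sample, attr)))

# Each individual tree is still a readable rule...
for attr, t in forest:
    print(f"if {attr} < {t} then yes else no")

# ...but the forest's answer is a majority vote over all of them,
# so there is no single rule list for the ensemble as a whole.
def forest_predict(customer):
    votes = Counter("yes" if customer[a] < t else "no" for a, t in forest)
    return votes.most_common(1)[0][0]

print(forest_predict({"age": 24, "income": 27000}))
```

Note that you can still inspect the individual trees inside a trained forest, or look at aggregate measures such as feature importances, but neither gives you one compact rule set the way a single tree does.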

What you describe to me sounds like recommender systems. You can read about different approaches here; I think it's a good introduction.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.