1. What is machine learning? 2. Show each iteration of k-means

  1. I believe the data analysis techniques that we discuss on
    this board fall under the general category of machine learning.

One definition of machine learning is:

"Machine learning is a branch of artificial intelligence (AI) and
computer science which focuses on the use of data and algorithms to
imitate the way that humans learn, gradually improving its accuracy."

Let's accept the above definition, especially about "gradually improving
its accuracy."

Then wouldn't you say that most data analysis techniques do not,
on their own, improve their accuracy? For example linear regression does
not improve its accuracy - it finds the single best straight line that
minimizes squared residuals, and it is done. Similarly, a decision tree
makes a best decision on each node, gets to the bottom of the tree, and
stops. We as the human can attempt to improve the accuracy (tune), but
the technique does not do this on its own.

So I have some uneasiness about the definition of machine learning.

I think a simple example of a method that gradually improves its accuracy
is Newton-Raphson. Would this qualify as a machine learning technique?
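A minimal sketch of that point: Newton-Raphson applied to f(x) = x² − 2 (so the root is √2; the function and starting guess here are just illustrative). Each update refines the estimate on its own, with no human tuning in the loop:

```r
# Newton-Raphson for f(x) = x^2 - 2, whose positive root is sqrt(2).
# Each iteration "gradually improves its accuracy" without intervention.
f  <- function(x) x^2 - 2
fp <- function(x) 2 * x              # derivative of f

x <- 3                               # starting guess
for (i in 1:6) {
  x <- x - f(x) / fp(x)              # Newton-Raphson update
  cat(sprintf("iter %d: x = %.10f, |error| = %.2e\n",
              i, x, abs(x - sqrt(2))))
}
```

The error shrinks (roughly quadratically) at every step, which is exactly the "gradually improving" behavior in question.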

The k-means method is one that does improve its accuracy, by iteratively
reassigning observations to clusters so as to minimize the total
within-cluster sum of squares.
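One way to see that claim directly (a sketch; the seed, k = 4, and the cap of 5 iterations are my choices, with `set.seed` repeated so every run starts from the same random centers): the total within-cluster sum of squares never increases as k-means is allowed more iterations.

```r
# tot.withinss as a function of the allowed number of k-means iterations.
arrests.scaled <- scale(USArrests)
wss <- sapply(1:5, function(im) {
  set.seed(123)                          # identical starting centers each time
  suppressWarnings(                      # iter.max = 1 may not converge yet
    kmeans(arrests.scaled, 4, nstart = 1, iter.max = im)
  )$tot.withinss
})
print(round(wss, 3))                     # a non-increasing sequence
```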

  1. Here is some code for k-means. How can I modify it to show the same
    information separately for each iteration?

library(factoextra)
data("USArrests")
arrests.scaled <- scale(USArrests)       # standardize before clustering
km.res <- kmeans(arrests.scaled, 4, nstart = 1)
aggregate(USArrests, by = list(cluster = km.res$cluster), mean)
fviz_cluster(km.res, arrests.scaled, ellipse.type = "norm")  # plot the same (scaled) data that was clustered

For the last part, maybe this:

library(factoextra)
library(purrr)
library(cowplot)
data("USArrests")
arrests.scaled <- scale(USArrests)
all_iters <- map(1:10, \(im) {
  set.seed(123)  # same starting centers every time, so the plots show one run unfolding
  km.res <- kmeans(arrests.scaled, 4, nstart = 1, iter.max = im)
  fviz_cluster(km.res, arrests.scaled, ellipse.type = "norm") +
    ggtitle(paste0("iterations: ", im))
})

save_plot(
  filename = "kmeans.png",
  plot = plot_grid(plotlist = all_iters, ncol = 1),
  ncol = 1,
  nrow = 10,
  base_height = 6,
  base_width = 12,
  limitsize = FALSE
)

For 1, I would disagree with the definition: the "gradually" is not really necessary; I suspect they included it to emphasize the similarity to human learning. Actually, as the sentence is written, it could be read as "ML focuses on the use of data to learn, imitating the way humans can use data to learn gradually".

A quick search gives numerous definitions of ML, e.g. Merriam-Webster, that do not include the word "gradually".

I suspect the definition was written by someone exposed to Deep Learning (which usually learns in epochs), with the goal of describing what they're selling rather than trying to give the best general definition.

At first glance, I would not put that in ML because of the lack of data. One could argue that the function itself is data, but I feel that's getting far from the standard use of "ML"; one could use Newton-Raphson in an ML algorithm, but that doesn't make NR itself ML.

The core of ML is that you have some model (e.g. a linear model, a neural network, or a number of clusters), that model has some parameters (e.g. the slope and intercept, the neuron weights, or the centers of the clusters), and you have data. The "learning" part is that you have an algorithm such that, using some metric (e.g. sum of squares, L1 loss, or within-cluster squared distance), the machine selects the parameters that best fit the data for that metric. The exact way the algorithm works (gradually or not) is irrelevant.
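A small sketch of this model / parameters / metric view (the simulated data and starting values are illustrative): fit y = a + b·x by explicitly minimizing the sum-of-squares metric with `optim()`, then compare with `lm()`, which reaches the same parameters in closed form. Whether the search is iterative or one-shot doesn't change what was "learned".

```r
set.seed(1)
x <- runif(50)
y <- 2 + 3 * x + rnorm(50, sd = 0.1)

sse <- function(par) sum((y - (par[1] + par[2] * x))^2)  # the metric

fit_iter   <- optim(c(0, 0), sse)    # iterative numerical search
fit_closed <- coef(lm(y ~ x))        # closed-form least squares

print(fit_iter$par)                  # roughly (2, 3)
print(fit_closed)                    # essentially the same parameters
```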

Alexis W, thank you.

I am still confused about the "learning" part of the definition.

How is machine learning different than what we think of as data analysis techniques? Do you have separate definitions of each of these that you like?

I think the perfect answer is this paper from Leo Breiman (quite readable), and here are some modern comments on it.

I recommend reading it, but in short, the difference between ML and statistics is more political and historical than some intrinsic difference in methods. Methods like linear regression can be used both as a statistical tool (e.g. interpreting the coefficients to infer something about the population) and as an ML tool (e.g. predicting a value based on measurements).
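To make the dual use concrete (a sketch on simulated data; the true slope of 2 is my choice): the very same `lm` fit supports both the "statistics" question and the "ML" question.

```r
set.seed(42)
d <- data.frame(x = runif(100))
d$y <- 1 + 2 * d$x + rnorm(100, sd = 0.2)
fit <- lm(y ~ x, data = d)

# "Statistics" use: infer something about the population.
print(confint(fit))                  # confidence intervals for the coefficients

# "ML" use: predict values for new measurements.
preds <- predict(fit, newdata = data.frame(x = c(0.1, 0.9)))
print(preds)
```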

Basically, statistics is more concerned with inferring something about the population your data comes from, while ML doesn't care where the data comes from or what algorithm you use, as long as you can maximize some kind of accuracy (very simplified). The separation is fuzzy and porous.

For definitions, I don't see an obvious problem with the Merriam-Webster ones:

statistics
a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data

machine learning
a computational method that is a subfield of artificial intelligence and that enables a computer to learn to perform tasks by analyzing a large dataset without being explicitly programmed

Arguably ML is a branch of statistics, which is a branch of mathematics; in practice different people do each, with different job titles/academic departments, but that's partly historical.

Thank you to both of you.
