I have two columns in a data frame, occ1 and occ2, and I want to measure how similar their sentences are: the first sentence in occ1 against every sentence in occ2, the second sentence in occ1 against every sentence in occ2, and so on.
Either cosine similarity or Jaro-Winkler (jw) similarity would be fine.
OCC1 = c(" Appoint department heads or managers and assign or delegate responsibilities to them", "Directing or coordinating business activities involved in the purchase or sale of investment products or financial services", "Analyze operations to assess the performance of a company or its staff in meeting objectives or to determine areas of potential cost reduction, program improvement, or policy change", "Directing, planning or implementing policies, objectives or activities of organizations or businesses to ensure continuity of operations, maximize return on investment or increase productivity", "Negotiate or approve contracts or agreements with suppliers, distributors, federal or state agencies or other organizational entities", "Coordinate the development or implementation of budget control systems, record keeping systems or other administrative control processes")
OCC2 = c("Define unit to participate in the production process", "Recommend types of investments to make", " Analyze political-economic, national and international trends", " Analyze industry of potential customers", " Implement human resources development policy", " Supervise the execution of commercial, industrial, administrative and financial activity plans", "Discuss results and their corrections with direct reports", "manage conflicts", " Manage the implementation of the quality system"," analyze scenarios", " Plan contracting services")
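# Pad the shorter vector with NA so both columns have the same length
max_ln <- max(length(OCC1), length(OCC2))
gfg_data <- data.frame(col1 = c(OCC1, rep(NA, max_ln - length(OCC1))),
                       col2 = c(OCC2, rep(NA, max_ln - length(OCC2))))
gfg_data
# Confirm the result is a data frame
is.data.frame(gfg_data)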
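To make the goal concrete, here is a minimal sketch of the all-against-all comparison using bag-of-words cosine similarity in base R. It is one possible approach, not a definitive one. The vectors `occ1` and `occ2` are shortened stand-ins for OCC1 and OCC2, and the helpers `tokenize` and `count_matrix` are hypothetical names introduced only for this illustration.

```r
# Shortened stand-ins for OCC1 / OCC2 so the sketch runs on its own
occ1 <- c("Analyze operations to assess performance",
          "Negotiate or approve contracts")
occ2 <- c("Analyze industry of potential customers",
          "Plan contracting services",
          "manage conflicts")

# Hypothetical helper: lowercase, replace non-letters with spaces, split into words
tokenize <- function(x) {
  x <- gsub("[^a-z ]+", " ", tolower(trimws(x)))
  lapply(strsplit(x, " +"), function(tk) tk[nzchar(tk)])
}

# Hypothetical helper: one row per sentence, one column per vocabulary word
count_matrix <- function(tokens, vocab) {
  do.call(rbind, lapply(tokens, function(tk) {
    as.numeric(table(factor(tk, levels = vocab)))
  }))
}

tok1 <- tokenize(occ1)
tok2 <- tokenize(occ2)
vocab <- unique(unlist(c(tok1, tok2)))

m1 <- count_matrix(tok1, vocab)
m2 <- count_matrix(tok2, vocab)

# Cosine similarity = dot product / product of Euclidean norms
# (a sentence with no words has norm 0 and would produce NaN)
norms <- sqrt(rowSums(m1^2)) %o% sqrt(rowSums(m2^2))
sim <- tcrossprod(m1, m2) / norms
dimnames(sim) <- list(occ1, occ2)

# Rows are occ1 sentences, columns are occ2 sentences
round(sim, 3)
```

With the real data, the NA padding in the data frame would have to be dropped first, for example with `na.omit(gfg_data$col1)`. For Jaro-Winkler, `stringdist::stringdistmatrix(occ1, occ2, method = "jw")` from the stringdist package returns a distance matrix, so subtracting it from 1 gives a similarity. Note that this compares the sentences character by character rather than word by word.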