I have 2 tables:
df1
+-------+--------------+
| Id | Title |
+-------+--------------+
| 1 | AAA12345 |
| 2 | BBB3245 |
| 3 | CCC876 |
+-------+--------------+
df2
+-------+---------------+------------------------------------+
| Id | Sub | Body |
+-------+---------------+------------------------------------+
| 1 | CC878 | some mail body CC some text here |
| 2 | AZ5464 | some text here AZ continues here |
| 3 | BB93245 | some text BB present here |
| 4 | some sub4 | AAA string is present here also |
| 5 | some sub5 | CCC string is present here |
+-------+---------------+------------------------------------+
I want to match the Title
from df1 with the Sub
column of the df2 with partial or closest matching pattern.
The output data frame should be like:
df3
+--------------+---------------+------------------------------------+
| Title | Sub | Body |
+--------------+---------------+------------------------------------+
| CCC876 | CC878 | some mail body CC some text here |
| BBB3245 | BB93245 | some text here BB continues here |
+--------------+---------------+------------------------------------+
Some additional information: I receive data with a lot of typos and need to match their product/sub description to their closest match.
Much appreciated.