Hello,
I have a data frame of real-estate property data. It contains errors because the records were entered manually. I created the column "price_m2_controll" so that I can use it to filter out false records.
I have created these rules:
# Filter-out rules for column "price_m2_controll"
#
# for RESIDENTIAL & COMMERCIAL property types
# - sale: >100€/m2 & <20000€/m2
# - rent: >1€/m2/month & <100€/m2/month
#
# for LAND property type
# - sale: >0.1€/m2 & <5000€/m2
# - rent: >0.01€/m2/month & <10€/m2/month
I want to drop all records that fall outside these rules and keep only those that fit.
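To make the rules above concrete, here is a minimal sketch that writes them as a lookup table and applies them with a join. The names `limits`, `min_price`, `max_price`, `demo` and `kept` are hypothetical, and the four-row `demo` table is made up only to show the mechanics. It assumes dplyr is installed and that both bounds are exclusive, as written in the rules.

```r
library(dplyr)

# Hypothetical lookup table: one row per (property_type, operation_type) pair,
# holding the exclusive lower and upper bounds from the rules.
limits <- tribble(
  ~property_type, ~operation_type, ~min_price, ~max_price,
  "residential",  "sale",          100,        20000,
  "residential",  "rent",          1,          100,
  "commercial",   "sale",          100,        20000,
  "commercial",   "rent",          1,          100,
  "land",         "sale",          0.1,        5000,
  "land",         "rent",          0.01,       10
)

# Tiny made-up sample, only to illustrate the idea.
demo <- tribble(
  ~property_type, ~operation_type, ~price_m2_controll,
  "residential",  "rent",          0.5,
  "residential",  "rent",          5,
  "land",         "sale",          1000,
  "land",         "sale",          60000
)

# Attach the bounds to every record, keep rows strictly inside them,
# then drop the helper columns again.
kept <- demo %>%
  inner_join(limits, by = c("property_type", "operation_type")) %>%
  filter(price_m2_controll > min_price, price_m2_controll < max_price) %>%
  select(-min_price, -max_price)
```

One consequence of this design: `inner_join()` also drops any record whose property or operation type has no row in `limits`, which may or may not be what is wanted for the real data.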
Here is a simplified sample of the data frame:
library(dplyr)  # provides tribble(), filter() and %>%

records <- tribble(
  ~property_type, ~operation_type, ~price_m2_controll,
  "residential", "rent", 0.5,
  "residential", "rent", 5,
  "residential", "rent", 200,
  "residential", "sale", 20,
  "residential", "sale", 1000,
  "residential", "sale", 25000,
  "commercial", "rent", 0.25,
  "commercial", "rent", 48,
  "commercial", "rent", 180,
  "commercial", "sale", 9,
  "commercial", "sale", 222,
  "commercial", "sale", 28000,
  "land", "rent", 0.005,
  "land", "rent", 5,
  "land", "rent", 12,
  "land", "sale", 0.1,
  "land", "sale", 1000,
  "land", "sale", 60000
)
# A tibble: 18 × 3
   property_type operation_type price_m2_controll
   <chr>         <chr>          <dbl>
 1 residential   rent           0.5
 2 residential   rent           5
 3 residential   rent           200
 4 residential   sale           20
 5 residential   sale           1000
 6 residential   sale           25000
 7 commercial    rent           0.25
 8 commercial    rent           48
 9 commercial    rent           180
10 commercial    sale           9
11 commercial    sale           222
12 commercial    sale           28000
13 land          rent           0.005
14 land          rent           5
15 land          rent           12
16 land          sale           0.1
17 land          sale           1000
18 land          sale           60000
This is my current filtering code, which does not give the correct results:
records <- records %>%
  # Residential & commercial (other) - SALE
  filter(property_type != "land" | operation_type == "sale" & price_m2_controll > 100 & price_m2_controll < 20000) %>%
  # Residential & commercial - RENT
  filter(property_type != "land" | operation_type == "rent" & price_m2_controll > 1 & price_m2_controll < 100) %>%
  # Land - SALE
  filter(property_type == "land" | operation_type == "sale" & price_m2_controll > 0.1 & price_m2_controll < 10) %>%
  # Land - RENT
  filter(property_type = "land" | operation_type == "rent" & price_m2_controll > 0.01 & price_m2_controll < 100)
I am stuck on this: I can't figure out how to write the filtering rule so that it gives the correct results. I would appreciate some help, please.
Thanks in advance
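
A possible direction, offered as a sketch rather than a definitive answer. Three things in the attempt seem to work against the stated rules. First, chained `filter()` calls are combined with AND, so a row has to pass all four rules at once. Second, `&` binds more tightly than `|` in R, so `property_type != "land" | ...` keeps every non-land row regardless of price. Third, `property_type = "land"` uses `=` instead of `==`, which `filter()` rejects as a named argument. The upper bounds in the two land filters (10 and 100) also differ from the rules (5000 and 10). The sketch below puts all four rules into one `filter()` call, joined with `|`, with each rule in its own parentheses. It rebuilds the 18-row sample compactly with `rep()`, uses the hypothetical name `records_clean`, and assumes that every non-land type follows the residential/commercial bounds.

```r
library(dplyr)

# Same 18 rows as the sample, built compactly.
records <- tibble(
  property_type     = rep(c("residential", "commercial", "land"), each = 6),
  operation_type    = rep(rep(c("rent", "sale"), each = 3), times = 3),
  price_m2_controll = c(0.5, 5, 200, 20, 1000, 25000,
                        0.25, 48, 180, 9, 222, 28000,
                        0.005, 5, 12, 0.1, 1000, 60000)
)

# A row is kept if it satisfies ANY one of the four rules.
# Parentheses make each rule a single unit before the rules are OR-ed.
records_clean <- records %>%
  filter(
    # Residential & commercial - SALE
    (property_type != "land" & operation_type == "sale" &
       price_m2_controll > 100 & price_m2_controll < 20000) |
    # Residential & commercial - RENT
    (property_type != "land" & operation_type == "rent" &
       price_m2_controll > 1 & price_m2_controll < 100) |
    # Land - SALE
    (property_type == "land" & operation_type == "sale" &
       price_m2_controll > 0.1 & price_m2_controll < 5000) |
    # Land - RENT
    (property_type == "land" & operation_type == "rent" &
       price_m2_controll > 0.01 & price_m2_controll < 10)
  )
```

On the sample this keeps six rows, one per property and operation type combination: the rent prices 5, 48 and 5, and the sale prices 1000, 222 and 1000. The land sale at exactly 0.1 is dropped because the lower bound is strict.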