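In the working example below, the smooth lines stay confined to their own data ranges even with the fullrange argument set to TRUE, and I get missing-value warnings (which does make sense, since each geom_smooth() call sets a new data range locally).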
# convert time series to data.frame, conserving date info
sb <- data.frame(Seatbelts, date = time(Seatbelts))
# convert from ts to date
library(lubridate)
sb$date <- as_date(date_decimal(as.numeric(sb$date)))
# store seatbelt law date
law <- ymd(19830131)
# plot
library(ggplot2)
ggplot(sb) + aes(x = date, y = front) +
geom_line() +
geom_vline(xintercept = law, colour = "red") +
geom_smooth(data = sb[sb$date < law,],
fullrange = TRUE) +
geom_smooth(data = sb[sb$date > law,],
fullrange = TRUE)
Current result: the smooth lines do not span the full range.
Warnings:
Warning messages:
1: Removed 10 rows containing missing values (geom_smooth).
2: Removed 71 rows containing missing values (geom_smooth).
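(Currently using ggplot2 3.1.0 and R 3.5.2.)
Edit: since I thought the problem was the preliminary subsetting of the data, I also tried this cleaner version, to no avail: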
# add before/after
sb$relative <- ifelse(sb$date < law, "before", "after")
# plot v.2
ggplot(sb) + aes(x = date, y = front) +
geom_line() +
geom_vline(xintercept = law, colour = "red") +
geom_smooth(aes(colour = relative),
fullrange = TRUE)
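Answer: by default, predictions from a LOESS fit are NA for points outside the range of the data used to fit it: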
dates <- seq(as.Date("1960-01-01"), law, by = "1 day")
head(setNames(predict(
loess(front ~ as.numeric(date), data = sb[sb$date < law, ]),
data.frame(date = as.numeric(dates))), dates))
1960-01-01 1960-01-02 1960-01-03 1960-01-04 1960-01-05 1960-01-06
NA NA NA NA NA NA
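This behavior is explained in ?predict.loess:
When the fit was made using 'surface = "interpolate"' (the default), 'predict.loess' will not extrapolate, so points outside an axis-aligned hypercube enclosing the original data will have missing ('NA') predictions and standard errors.
Therefore, to extrapolate to points outside the range used to fit the LOESS model, we can pass control = loess.control(surface = "direct") to loess.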
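As a quick check of the difference, here is a minimal sketch comparing the two surface settings on the same pre-law data. The column name t, the cutoff 1983, and the two test points are assumptions chosen for illustration; they are not part of the original question.

```r
# Monthly front-seat casualties with decimal-year time (base R only)
sb_num <- data.frame(front = as.numeric(Seatbelts[, "front"]),
                     t     = as.numeric(time(Seatbelts)))
before <- sb_num[sb_num$t < 1983, ]

# One point inside the fitted range, one outside it
new_pts <- data.frame(t = c(1975, 1984.5))

fit_interp <- loess(front ~ t, data = before)  # default surface = "interpolate"
fit_direct <- loess(front ~ t, data = before,
                    control = loess.control(surface = "direct"))

predict(fit_interp, new_pts)  # NA for the point outside the fitted range
predict(fit_direct, new_pts)  # finite values for both points
```

Extrapolated LOESS values can drift quickly away from the data, so they are worth inspecting before plotting.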
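Unfortunately, this means we need to perform the two LOESS fits manually, predict values over both regions of interest, and plot everything ourselves.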
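The manual route described above might look roughly like the following sketch. It is not a definitive implementation: the helper fit_full_range, the num_date column, and the use of first-of-month dates (instead of the lubridate conversion in the question) are assumptions made to keep the example self-contained, and confidence bands are left out.

```r
library(ggplot2)

# Rebuild the question's data with base R (first day of each month; an assumption)
sb <- data.frame(Seatbelts)
sb$date <- seq(as.Date("1969-01-01"), by = "month", length.out = nrow(sb))
sb$num_date <- as.numeric(sb$date)
law <- as.Date("1983-01-31")

# Hypothetical helper: fit LOESS on a subset, then predict over a full date grid
fit_full_range <- function(subset_df, grid, label) {
  fit <- loess(front ~ num_date, data = subset_df,
               control = loess.control(surface = "direct"))
  data.frame(date = grid,
             relative = label,
             fit = predict(fit, newdata = data.frame(num_date = as.numeric(grid))))
}

# Two separate fits, each predicted over the whole date range
smooth_df <- rbind(
  fit_full_range(sb[sb$date < law, ], sb$date, "before"),
  fit_full_range(sb[sb$date > law, ], sb$date, "after")
)

# Raw series, law date, and both extrapolated smooths
ggplot(sb, aes(x = date, y = front)) +
  geom_line() +
  geom_vline(xintercept = law, colour = "red") +
  geom_line(data = smooth_df, aes(y = fit, colour = relative))
```

Because each fit is extended far beyond its own data (especially the short "after" series), the extrapolated parts of the curves should be read with caution.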