Time Series Plots in R
In this tutorial, we'll go over how to create time series plots in R. Time series data consists of observations of a variable recorded at successive points in time, usually at regular intervals.
Time series data is widely used in stock market analysis, weather analysis, market trend analysis and any other scenarios where data variations with time are important.
R has several packages to perform time-series plotting and analysis tasks. Let us begin by acquiring some standard time series data for our work.
Several data scientists and organizations have open-sourced time series datasets that could be directly downloaded to the R environment. Two of these sources are:
The packages can be installed into your R environment using the install.packages("packagename") command. Other relevant instructions are available on the websites given above.
Let us proceed with some data from the tsdl package for illustrating time series plotting.
The tsdl package has numerous data series across several categories. Let us try accessing some of these sets. The first step is to load the package into memory.
library(tsdl)
tsdl
Time Series Data Library: 648 time series

                      Frequency
Subject                 0.1 0.25    1    4    5    6   12   13   52  365 Total
Agriculture               0    0   37    0    0    0    3    0    0    0    40
Chemistry                 0    0    8    0    0    0    0    0    0    0     8
Computing                 0    0    6    0    0    0    0    0    0    0     6
Crime                     0    0    1    0    0    0    2    1    0    0     4
Demography                1    0    9    2    0    0    3    0    0    2    17
Ecology                   0    0   23    0    0    0    0    0    0    0    23
Finance                   0    0   23    5    0    0   20    0    2    1    51
Health                    0    0    8    0    0    0    6    0    1    0    15
Hydrology                 0    0   42    0    0    0   78    1    0    6   127
Industry                  0    0    9    0    0    0    2    0    1    0    12
Labour market             0    0    3    4    0    0   17    0    0    0    24
Macroeconomic             0    0   18   33    0    0    5    0    0    0    56
Meteorology               0    0   18    0    0    0   17    0    0   12    47
Microeconomic             0    0   27    1    0    0    7    0    1    0    36
Miscellaneous             0    0    4    0    1    1    3    0    1    0    10
Physics                   0    0   12    0    0    0    4    0    0    0    16
Production                0    0    4   14    0    0   28    1    1    0    48
Sales                     0    0   10    3    0    0   24    0    9    0    46
Sport                     0    1    1    0    0    0    0    0    0    0     2
Transport and tourism     0    0    1    1    0    0   12    0    0    0    14
Tree-rings                0    0   34    0    0    0    1    0    0    0    35
Utilities                 0    0    2    1    0    0    8    0    0    0    11
Total                     1    1  300   64    1    1  240    3   16   21   648
Let us try choosing a time-series for our plotting. We first create a subset of the above dataset using the subset function for the respective category.
crime <- subset(tsdl, 'Crime')
Now, to access an individual time series, we index the subset created above. The second series in this subset records the monthly number of armed robberies in Boston from January 1966 to October 1975.
crime[[2]]
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1966  41  39  50  40  43  38  44  35  39  35  29  49
1967  50  59  63  32  39  47  53  60  57  52  70  90
1968  74  62  55  84  94  70 108 139 120  97 126 149
1969 158 124 140 109 114  77 120 133 110  92  97  78
1970  99 107 112  90  98 125 155 190 236 189 174 178
1971 136 161 171 149 184 155 276 224 213 279 268 287
1972 238 213 257 293 212 246 353 339 308 247 257 322
1973 298 273 312 249 286 279 309 401 309 328 353 354
1974 327 324 285 243 241 287 355 460 364 487 452 391
1975 500 451 375 372 302 316 398 394 431 431
We now create a time series object from this using the ts() function.
series <- ts(crime[[2]])
series
Time Series:
Start = 1
End = 118
Frequency = 1
  [1]  41  39  50  40  43  38  44  35  39  35  29  49  50  59  63  32  39  47  53  60
 [21]  57  52  70  90  74  62  55  84  94  70 108 139 120  97 126 149 158 124 140 109
 [41] 114  77 120 133 110  92  97  78  99 107 112  90  98 125 155 190 236 189 174 178
 [61] 136 161 171 149 184 155 276 224 213 279 268 287 238 213 257 293 212 246 353 339
 [81] 308 247 257 322 298 273 312 249 286 279 309 401 309 328 353 354 327 324 285 243
[101] 241 287 355 460 364 487 452 391 500 451 375 372 302 316 398 394 431 431
attr(,"source")
[1] McCleary & Hay (1980)
attr(,"description")
[1] Monthly Boston armed robberies Jan.1966-Oct.1975 Deutsch and Alt (1977)
attr(,"subject")
[1] Crime
The ts() function converts a numeric vector into a time series object. The syntax is as follows:
ts(vector, start, end, frequency)
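As a minimal sketch of this syntax, the snippet below builds a monthly series from the twelve 1966 values printed earlier. The object names robberies_1966 and monthly are illustrative and are not part of the original tutorial.

```r
# First twelve values of the robbery series shown above (Jan-Dec 1966).
robberies_1966 <- c(41, 39, 50, 40, 43, 38, 44, 35, 39, 35, 29, 49)

# start = c(year, period) dates the first observation;
# frequency = 12 marks the data as monthly.
monthly <- ts(robberies_1966, start = c(1966, 1), frequency = 12)

start(monthly)      # time of the first observation as c(year, month)
end(monthly)        # time of the last observation, inferred from the length
frequency(monthly)  # observations per year
print(monthly)      # monthly series print as a year-by-month table
```

Leaving out end is usually the safer choice, since ts() then derives it from the number of values supplied.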
The start and end arguments set the times of the first and last observations; they label the data rather than select from it. To work with only a part of a dated series instead of the whole series, extract that part with the window() function.
We can retrieve only the crime data from January 1970 to December 1972 using the following command:
shortseries <- window(crime[[2]], start = c(1970, 1), end = c(1972, 12))
shortseries
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1970  99 107 112  90  98 125 155 190 236 189 174 178
1971 136 161 171 149 184 155 276 224 213 279 268 287
1972 238 213 257 293 212 246 353 339 308 247 257 322
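The difference between relabelling a series and extracting part of it can be checked with a small sketch. The data here is synthetic (the integers 1 to 48 standing in for four years of monthly values), and the names x, relabelled and extracted are illustrative.

```r
# Synthetic monthly series: 48 values covering Jan 1966 to Dec 1969.
x <- ts(1:48, start = c(1966, 1), frequency = 12)

# Passing start/end to ts() only assigns new dates to the leading values:
# this takes the FIRST twelve values and labels them as 1968.
relabelled <- ts(as.numeric(x), start = c(1968, 1), end = c(1968, 12),
                 frequency = 12)

# window() selects the observations that actually fall in 1968.
extracted <- window(x, start = c(1968, 1), end = c(1968, 12))

as.numeric(relabelled)  # values 1 to 12
as.numeric(extracted)   # values 25 to 36
```

Both results carry the same dates, so the mistake is easy to miss unless the values themselves are compared.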
The frequency argument gives the number of observations per unit of time, here per year: 1 indicates annual data, 4 quarterly, 12 monthly and so on. The default is 1, so each observation is treated as its own time unit; ts() does not aggregate or average the data.
Since our data is monthly, we need to specify 12 as the frequency (one observation every month).
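To see what frequency changes in practice, here is a small sketch that dates the same eight values three different ways. The values and object names are illustrative only.

```r
x <- 1:8  # illustrative values, not taken from the tsdl data

annual    <- ts(x, start = 2000, frequency = 1)         # 2000 to 2007
quarterly <- ts(x, start = c(2000, 1), frequency = 4)   # 2000 Q1 to 2001 Q4
monthly   <- ts(x, start = c(2000, 1), frequency = 12)  # Jan 2000 to Aug 2000

end(annual)       # last time point of each series
end(quarterly)
end(monthly)

time(quarterly)   # time index in fractional years
cycle(quarterly)  # position of each observation within its year

# The number of observations is unchanged: nothing is averaged or dropped.
c(length(annual), length(quarterly), length(monthly))
```

Only the time index differs between the three objects; the data values are identical.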
R provides the plot.ts() function to plot time series graphs. Let us re-examine our series data.
series <- ts(crime[[2]])
plot.ts(series)
Since this series was not specified with a start and end date, the plot will just display the observation number instead of the year number.
We are now going to redefine the series object with starting and ending dates and frequency set to 12.
series <- ts(crime[[2]], start = c(1966, 1), end = c(1975, 10), frequency = 12)
plot.ts(series)
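The effect of dating the series on the x-axis can be compared directly. The sketch below uses only the first three years of values from the printed output, draws the undated and dated versions one above the other, and writes the figure to a PDF so that it also runs in a non-interactive session. The file name and the axis labels are assumptions made for this example.

```r
# Values for Jan 1966 to Dec 1968, copied from the series printed earlier.
robberies <- c(41, 39, 50, 40, 43, 38, 44, 35, 39, 35, 29, 49,
               50, 59, 63, 32, 39, 47, 53, 60, 57, 52, 70, 90,
               74, 62, 55, 84, 94, 70, 108, 139, 120, 97, 126, 149)

undated <- ts(robberies)                                    # index 1, 2, 3, ...
dated   <- ts(robberies, start = c(1966, 1), frequency = 12)  # monthly dates

# Hypothetical output location; any writable path would do.
out_file <- file.path(tempdir(), "robberies.pdf")
pdf(out_file, width = 8, height = 6)
par(mfrow = c(2, 1))  # two panels stacked vertically
plot.ts(undated, xlab = "Observation number", ylab = "Robberies",
        main = "Without dates")
plot.ts(dated, xlab = "Year", ylab = "Robberies",
        main = "With start = c(1966, 1), frequency = 12")
dev.off()
```

In an interactive session the pdf() and dev.off() calls can be dropped to draw on the screen device instead.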
It is possible to analyze the time series further using decomposition, which splits the observed series into three components: trend, seasonal and random. These can be plotted as three separate panels along with the observed series.
These components can be derived from a series using the decompose() function as follows.
decseries <- decompose(series)
The result is a list of all the above components of the series. These can be plotted using a plot() function directly.
plot(decseries)
From the graph, it can be observed that the robberies follow a seasonal pattern and that the trend is generally rising.
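The individual components can also be inspected without plotting. The following sketch assumes the robbery values are typed in from the printed output above rather than loaded from tsdl, so it runs without the package; the object names are illustrative.

```r
# Monthly Boston armed robberies, Jan 1966 to Oct 1975 (118 values),
# copied from the printed series; one row per year.
robberies <- c(
   41,  39,  50,  40,  43,  38,  44,  35,  39,  35,  29,  49,
   50,  59,  63,  32,  39,  47,  53,  60,  57,  52,  70,  90,
   74,  62,  55,  84,  94,  70, 108, 139, 120,  97, 126, 149,
  158, 124, 140, 109, 114,  77, 120, 133, 110,  92,  97,  78,
   99, 107, 112,  90,  98, 125, 155, 190, 236, 189, 174, 178,
  136, 161, 171, 149, 184, 155, 276, 224, 213, 279, 268, 287,
  238, 213, 257, 293, 212, 246, 353, 339, 308, 247, 257, 322,
  298, 273, 312, 249, 286, 279, 309, 401, 309, 328, 353, 354,
  327, 324, 285, 243, 241, 287, 355, 460, 364, 487, 452, 391,
  500, 451, 375, 372, 302, 316, 398, 394, 431, 431)

series <- ts(robberies, start = c(1966, 1), frequency = 12)
dec <- decompose(series)  # additive model by default

names(dec)           # component names stored in the result
head(dec$trend, 12)  # moving-average trend; undefined (NA) at both ends
dec$figure           # the twelve estimated monthly seasonal effects

# Where the trend is defined, the three components add back up to the data.
ok <- !is.na(dec$trend)
recomposed <- dec$trend + dec$seasonal + dec$random
all.equal(as.numeric(recomposed)[ok], robberies[ok])
```

Because the seasonal swings in this series appear to grow with the level, decompose(series, type = "multiplicative") may be worth comparing, although that goes beyond what the tutorial itself does.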
Time series plots are an important means of data analysis for sequential and time-varying data. R functions such as ts(), plot.ts() and decompose() make these tasks easier.