Commonly used R functions for numeric data

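A quick test of some commonly used R functions for working with numbers.

They are as follows:

round, rounds values to the given number of decimal places.

For example: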

> round(x,1)
      x   y   z
1   0.0 0.0 0.0
2   0.0 0.2 0.5
3   0.8 0.9 0.8
4   0.3 0.6 1.0
5   0.4 0.8 0.4
6   0.3 0.9 0.8
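
A few more cases may help show what the second argument of round does. This is only a small sketch with made-up inputs (the variable names are illustrative). Note that for values exactly halfway between two candidates, R follows the IEC 60559 rule and rounds to the even digit, so the result can differ from schoolbook "round half up".

```r
# digits > 0 keeps decimal places, digits < 0 rounds to tens, hundreds, ...
two_places <- round(3.14159, 2)        # 3.14
hundreds   <- round(1234.567, -2)      # 1200

# Exact halves go to the even neighbour (IEC 60559), not always upwards
halves <- round(c(0.5, 1.5, 2.5))      # 0 2 2

# signif() rounds to significant digits instead of decimal places
three_sig <- signif(1234.567, 3)       # 1230
```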

sort, sorts in ascending order.

rev, reverses the order of the elements (note: rev alone does not sort; to get a descending sort you need rev(sort(?)))

> x <- 1:10
> y <- rep(x, each=2)
> y <- rep(x, times=2)
> y
 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10
> sort(y)
 [1]  1  1  2  2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10
> rev(sort(y))
 [1] 10 10  9  9  8  8  7  7  6  6  5  5  4  4  3  3  2  2  1  1
> rev(y)
 [1] 10  9  8  7  6  5  4  3  2  1 10  9  8  7  6  5  4  3  2  1
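
As an alternative to rev(sort(y)), sort also takes a decreasing argument, and order returns the permutation of indices rather than the sorted values. A minimal sketch, reusing the same y as above (desc and ord are illustrative names):

```r
y <- rep(1:10, times = 2)

# Descending sort in one call
desc <- sort(y, decreasing = TRUE)
same_as_rev <- identical(desc, rev(sort(y)))   # TRUE

# order() gives the index permutation; indexing with it sorts the vector
ord <- order(y, decreasing = TRUE)
via_order <- y[ord]
```

order is the more useful of the two when another vector or a data frame has to be rearranged in the same way.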

For rank and scale, see:

http://blog.163.com/digoal@126/blog/static/163877040201533335228/

http://blog.163.com/digoal@126/blog/static/16387704020153343446995/

log(x, base), the logarithm of x to the given base. The default base is e.

> log(100,2)
[1] 6.643856
> log(100,10)
[1] 2
> 2^6.643856
[1] 99.99999
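
The small discrepancy in 2^6.643856 above comes from typing in the rounded, printed value. A short sketch of the related helpers, with variable names chosen only for illustration:

```r
natural <- log(100)             # natural logarithm, about 4.60517
base2   <- log(8, base = 2)     # 3
short2  <- log2(8)              # shortcut for base 2
short10 <- log10(1000)          # shortcut for base 10

# Using the unrounded result, the round trip is exact up to floating point error
back <- 2^log(100, 2)
```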

pmin,

pmax,

Given several vectors, these return the minimum or maximum at each position (the "parallel" minimum and maximum).

For example:

> x
 [1]  1  2  3  4  5  6  7  8  9 10
> y <- rev(sort(x))
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> y
 [1] 10  9  8  7  6  5  4  3  2  1
The comparison is made position by position:
> pmin(x,y)
 [1] 1 2 3 4 5 5 4 3 2 1
> pmax(x,y)
 [1] 10  9  8  7  6  6  7  8  9 10
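It is best to use vectors of the same length. Otherwise the shorter vectors are recycled, and a warning is issued when the longest length is not a multiple of the shorter ones (here z is the 13-element vector 10..22).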
> pmax(x,y,z)
 [1] 10 11 12 13 14 15 16 17 18 19 20 21 22
Warning message:
In pmax(x, y, z) : an argument will be fractionally recycled
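
To make the difference from plain min and max concrete, here is a small sketch with made-up vectors a and b. A length-1 argument is recycled without any warning, which makes pmax and pmin handy for clamping values.

```r
a <- c(1, 5, 3)
b <- c(4, 2, 6)

elementwise_min <- pmin(a, b)    # 1 2 3
elementwise_max <- pmax(a, b)    # 4 5 6
overall_max     <- max(a, b)     # 6, a single number

# Clamp every element of a to be at least 2
floor_at_2 <- pmax(a, 2)         # 2 5 3

# With na.rm = TRUE an NA is ignored at its position
with_na <- pmax(c(1, NA, 3), 2, na.rm = TRUE)   # 2 2 3
```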


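cumsum, the cumulative sum up to each position
cumprod, the cumulative product up to each position
cummin, the minimum of all elements up to each position
cummax, the maximum of all elements up to each position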

> x
 [1]  1  2  3  4  5  6  7  8  9 10
> cumsum(x)
 [1]  1  3  6 10 15 21 28 36 45 55
1 , 1+2 , 1+2+3 , .....
> cumprod(x)
 [1]       1       2       6      24     120     720    5040   40320  362880
[10] 3628800
1, 1*2, 1*2*3, ....
> cummin(x)
 [1] 1 1 1 1 1 1 1 1 1 1
min(1), min(1,2), min(1,2,3),...
> cummax(x)
 [1]  1  2  3  4  5  6  7  8  9 10
max(1), max(1,2), max(1,2,3),...
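
Because x above is already increasing, cummin and cummax look trivial there. A sketch on an unsorted vector (v is an illustrative input) shows the running behaviour more clearly:

```r
v <- c(3, 1, 4, 1, 5)

running_sum  <- cumsum(v)    # 3 4 8 9 14
running_prod <- cumprod(v)   # 3 3 12 12 60
running_min  <- cummin(v)    # 3 1 1 1 1
running_max  <- cummax(v)    # 3 3 4 4 5
```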

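match(x,y), returns a vector of the same length as x giving, for each element of x, the index of its first match in y (NA if there is no match).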

> x
 [1]  5  6  7  8  9 10 11 12 13 14 15 16
> y
 [1] 10  9  8  7  6  5  4  3  2  1
> match(x,y)
 [1]  6  5  4  3  2  1 NA NA NA NA NA NA
The value 5 is at position 6 in y, i.e. y[6]==x[1]
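
Two related forms are worth knowing: %in% answers only "is it present?", and the nomatch argument replaces the NA. A minimal sketch with illustrative character vectors:

```r
keys   <- c("b", "z", "a")
lookup <- c("a", "b", "c", "a")

pos     <- match(keys, lookup)               # 2 NA 1 (first match only)
present <- keys %in% lookup                  # TRUE FALSE TRUE
pos0    <- match(keys, lookup, nomatch = 0)  # 2 0 1
```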

which (x == a), where x is a vector and a is a single value: returns the index positions of the elements of x that equal a.

> x
 [1] 10 10  9  9  8  8  7  7  6  6  5  5  4  4  3  3  2  2  1  1
> which(x == 1)
[1] 19 20
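Note: do not use a vector for a. With ==, the shorter vector is recycled and compared position by position, so the result is not a membership test: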
> which(x == c(1,3,100))
[1] 19
Warning message:
In x == c(1, 3, 100) :
  longer object length is not a multiple of shorter object length
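
If the intent is to find every position whose value is any one of several candidates, %in% is the usual tool. A sketch that rebuilds the same x as above:

```r
x <- rev(sort(rep(1:10, times = 2)))

# Positions of x whose value is 1, 3 or 100 (100 simply never matches)
hits <- which(x %in% c(1, 3, 100))   # 15 16 19 20
```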

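choose(n,k), the binomial coefficient.

lchoose(n,k), the natural logarithm (base e) of the absolute value of the binomial coefficient.

The definition depends on whether k >= 1, k = 0, or k < 0:
 n(n-1)...(n-k+1) / k! for k >= 1,
 1 for k = 0,
 0 for negative k.
If k is not an integer, it is rounded to an integer, with a warning.
Example: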
> choose(4,3)
[1] 4
> (4*3*2)/(3*2*1)
[1] 4

lchoose returns the logarithm of the absolute value of the choose result.
> choose(10,2)
[1] 45
> lchoose(10,2)
[1] 3.806662
> log(45)
[1] 3.806662
From the help page:
     The functions ‘choose’ and ‘lchoose’ return binomial coefficients
     and the logarithms of their absolute values.  Note that ‘choose(n,
     k)’ is defined for all real numbers n and integer k.  For k >= 1
     it is defined as n(n-1)...(n-k+1) / k!, as 1 for k = 0 and as 0
     for negative k.  Non-integer values of ‘k’ are rounded to an
     integer, with a warning.

     ‘choose(*, k)’ uses direct arithmetic (instead of ‘[l]gamma’
     calls) for small ‘k’, for speed and accuracy reasons.  Note the
     function ‘combn’ (package ‘utils’) for enumeration of all possible
     combinations.
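
The following sketch checks choose against the factorial formula and against combn, and shows one reason lchoose exists: for large arguments the coefficient itself overflows a double, while its logarithm stays finite. The inputs are illustrative.

```r
c52      <- choose(5, 2)                                  # 10
by_hand  <- factorial(5) / (factorial(2) * factorial(3))  # 10
n_combos <- ncol(combn(5, 2))                             # 10 enumerated pairs

k_zero     <- choose(5, 0)     # 1
k_negative <- choose(5, -1)    # 0

# The coefficient is far beyond the double range, its log is not
too_big <- choose(2000, 1000)  # Inf
log_big <- lchoose(2000, 1000) # finite
```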

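na.omit(x), drops the NA values from a vector; if x is a matrix or data frame, it drops every row that contains an NA.

na.fail(x), signals an error if x contains any NA values.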

> na.fail(c(1,2,3,NA))
Error in na.fail.default(c(1, 2, 3, NA)) : missing values in object

> na.omit(c(1,2,3,NA,4))
[1] 1 2 3 4
attr(,"na.action")
[1] 4
attr(,"class")
[1] "omit"
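
A sketch of the data frame case, plus two common alternatives that do not attach the na.action attribute. The data frame df and its column names are made up for illustration.

```r
v <- c(1, 2, NA, 4)

# Plain logical indexing leaves no extra attributes on the result
clean <- v[!is.na(v)]               # 1 2 4

# Many summary functions can skip NA themselves
avg <- mean(v, na.rm = TRUE)        # 7/3

# For a data frame, any row with an NA in any column is dropped
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))
kept <- na.omit(df)                 # only the first row survives
full_rows <- complete.cases(df)     # TRUE FALSE FALSE
```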

unique(x), removes duplicate values.

> unique(c(1,1,1,1,2,3,4,4))
[1] 1 2 3 4
For a data frame, duplicate rows are removed:
> x <- data.frame(rep(1:2, each=2),rep(1:2, each=2))
> x
  rep.1.2..each...2. rep.1.2..each...2..1
1                  1                    1
2                  1                    1
3                  2                    2
4                  2                    2
> unique(x)
  rep.1.2..each...2. rep.1.2..each...2..1
1                  1                    1
3                  2                    2
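
unique keeps values in order of first appearance, and duplicated is its logical counterpart, marking every repeat of an earlier value. A minimal sketch with an illustrative vector v:

```r
v <- c(3, 1, 3, 2, 1)

distinct  <- unique(v)           # 3 1 2
is_repeat <- duplicated(v)       # FALSE FALSE TRUE FALSE TRUE
same      <- v[!duplicated(v)]   # equivalent to unique(v)
n_unique  <- length(unique(v))   # 3
```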

table(x), returns a table of how many times each value occurs. For a data frame, as below, it cross-tabulates the columns.

> x <- data.frame(rep(6:8, each=2),rep(5:7, each=2))
> x
  rep.6.8..each...2. rep.5.7..each...2.
1                  6                  5
2                  6                  5
3                  7                  6
4                  7                  6
5                  8                  7
6                  8                  7
> table(x)
                  rep.5.7..each...2.
rep.6.8..each...2. 5 6 7
                 6 2 0 0
                 7 0 2 0
                 8 0 0 2
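
For a single vector the result is a one-dimensional table of counts that can be indexed by value. A small sketch with made-up data (v and counts are illustrative names):

```r
v <- c("a", "b", "a", "c", "a")

counts <- table(v)
n_a    <- counts[["a"]]             # 3
values <- names(counts)             # "a" "b" "c"
most   <- names(which.max(counts))  # "a"
```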

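table(x,y),

Returns the contingency table of x and y. The inputs are typically two factors (other vectors are converted to factors), and they must have the same length.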

> x <- gl(1,10)
> y <- gl(10,2)
> x
 [1] 1 1 1 1 1 1 1 1 1 1
Levels: 1
> y
 [1] 1  1  2  2  3  3  4  4  5  5  6  6  7  7  8  8  9  9  10 10
Levels: 1 2 3 4 5 6 7 8 9 10
> table(x,y)
Error in table(x, y) : all arguments must have the same length
> y <- gl(10,1)
> table(x,y)
   y
x   1 2 3 4 5 6 7 8 9 10
  1 1 1 1 1 1 1 1 1 1  1
> mode(table(x,y))
[1] "numeric"
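
A slightly less degenerate contingency table, as a sketch with two short illustrative factors f and g. Cells are indexed by level name, and combinations that never occur are reported as 0.

```r
f <- factor(c("a", "a", "b"))
g <- factor(c("u", "v", "u"))

tab <- table(f, g)
a_u <- tab["a", "u"]        # 1
b_v <- tab["b", "v"]        # 0, the pair never occurs
row_totals <- rowSums(tab)  # a = 2, b = 1
```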

subset(x, ...), returns the subset of x that satisfies the given condition.

> z
 [1] 10 11 12 13 14 15 16 17 18 19 20 21 22
> subset(z, z>1)
 [1] 10 11 12 13 14 15 16 17 18 19 20 21 22
> subset(z, z>15)
[1] 16 17 18 19 20 21 22

> x <- data.frame(1:10, 5:14, 6:15)
> x
   X1.10 X5.14 X6.15
1      1     5     6
2      2     6     7
3      3     7     8
4      4     8     9
5      5     9    10
6      6    10    11
7      7    11    12
8      8    12    13
9      9    13    14
10    10    14    15
> subset(x, x$X1.10>5)
   X1.10 X5.14 X6.15
6      6    10    11
7      7    11    12
8      8    12    13
9      9    13    14
10    10    14    15
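
Inside subset, columns of a data frame can be referred to by bare name, and the select argument picks columns. The sketch below uses a made-up data frame with explicit column names id and score, which are not in the original example.

```r
df <- data.frame(id = 1:10, score = 5:14)

# Rows where id > 5, keeping only the score column
high <- subset(df, id > 5, select = score)

# Roughly equivalent bracket indexing
high2 <- df[df$id > 5, "score", drop = FALSE]
```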

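sample(x, size), where x can be a vector or a list: draws size elements from x at random. If x is a list (a data frame is a list of columns), the elements sampled are its columns.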

> x
  rep.6.8..each...2. rep.5.7..each...2.
1                  6                  5
2                  6                  5
3                  7                  6
4                  7                  6
5                  8                  7
6                  8                  7
> mode(x)
[1] "list"
> length(x)
[1] 2
> sample(x, 1)
  rep.5.7..each...2.
1                  5
2                  5
3                  6
4                  6
5                  7
6                  7
> z
 [1] 10 11 12 13 14 15 16 17 18 19 20 21 22
> sample(z, 5)
[1] 21 13 22 14 15
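There are two further optional arguments, prob and replace.
prob is a vector of the same length as x that gives a sampling weight for each position: the larger the weight, the more likely that element is chosen, and 0 excludes it.
For example: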
> z-10
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12
With the following prob, the first element of z (10) can never be sampled.
> sample(z, 5, prob=z-10)
[1] 16 13 12 20 17
replace=TRUE means the same element may be drawn more than once.
> sample(z, 5, prob=z-10, replace=TRUE)
[1] 18 12 15 22 14
> sample(z, 5, prob=z-10, replace=TRUE)   # 21 and 22 have the largest prob values, so they are the most likely to be drawn
[1] 21 21 21 18 11
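
Since sample is random, the draws above differ from run to run. A sketch of how to make them repeatable with set.seed (the seed value 42 is arbitrary), together with checks that hold whatever the draw is:

```r
set.seed(42)             # any fixed seed makes the draws repeatable
z <- 10:22
w <- z - 10              # weight 0 for the first element, so 10 is excluded

no_repeat   <- sample(z, 5, prob = w)                  # 5 distinct values
with_repeat <- sample(z, 5, prob = w, replace = TRUE)  # repeats are possible

never_ten  <- !(10 %in% c(no_repeat, with_repeat))     # TRUE
all_unique <- anyDuplicated(no_repeat) == 0            # TRUE

# When x is a single number n >= 1, sample draws from 1:n instead
from_one_to_ten <- sample(10, 3)
```

The last line is a common pitfall: sampling from a vector that happens to have length one does not return that one value.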

Time: 2024-11-02 22:47:22
