Stata函数

彭华@StataCorp

2017 Stata 中国用户大会

函数

  • Stata函数
  • Mata函数
  • egen函数

Stata函数

  • 组成expression
  • 用户无法扩展

Mata函数

  • Mata主要构成部分
  • 用户可扩展

egen函数

  • 通过egen命令生成新变量
  • 用户可扩展

Mata函数举例

mata:
real scalar add(real scalar n)
{
    real scalar i, res
    
    if(n<=0) {
        return(.)
    }
    
    res = 0 
    for(i=1; i<=n; i++) {
        res = res+i
    }
    
    return(res)
}
end

Mata函数计算距离矩阵

mata:
real matrix dist1(real matrix p)
{
        real scalar r, c
        real scalar i, j
        real matrix res
        
        r = rows(p)
        c = cols(p)
        
        if(r==0 && c != 2) {
            return(J(0, 0, .))
        }
        
        res = J(r, r, .)
        
        for(i=1; i<=r; i++) {
            for(j=1; j<=r; j++) {
                res[i, j] = sqrt(
                    (p[i,1]-p[j,1])*(p[i,1]-p[j,1])+
                    (p[i,2]-p[j,2])*(p[i,2]-p[j,2])
                        )
            }
        }
        return(res)
}
end

验证结果,测量运行时间

mata:
p = (1, 1)
r = dist1(p)
assert(r == 0)

p = (1, 1\2, 2)
r = dist1(p)
assert(r == (0, sqrt(2)\sqrt(2), 0))

rseed(12345)
p0 = runiform(2000, 2)
timer_clear()
timer_on(1)
r1 = dist1(p0)
timer_off(1)
timer()
end

precision

. clear

. set obs 10
number of observations (_N) was 0, now 10

. gen x = 1.1

. count if x == 1.1
  0

. count if x == float(1.1)
  10

. display %21x x[1]
+1.19999a0000000X+000

. display %21x 1.1
+1.199999999999aX+000

. display %21x float(1.1)
+1.19999a0000000X+000
http://blog.stata.com/2012/04/02/the-penultimate-guide-to-precision/

改善

mata:
real matrix dist2(real matrix p)
{
        real scalar r, c
        real scalar i, j
        real matrix res
        
        r = rows(p)
        c = cols(p)
        
        if(r==0 && c != 2) {
            return(J(0, 0, .))
        }
        
        res = J(r, r, .)
        
        for(i=1; i<=r; i++) {
            for(j=1; j<i; j++) {
                res[i, j] = sqrt(
                    (p[i,1]-p[j,1])*(p[i,1]-p[j,1])+
                    (p[i,2]-p[j,2])*(p[i,2]-p[j,2])
                        )
            }           
            res[i,i] = 0
        }
        _makesymmetric(res)
        return(res)
}
end

验证结果, 测量运行时间

mata:
p = (1, 1)
r = dist2(p)
assert(r == 0)

p = (1, 1\2, 2)
r = dist2(p)
assert(mreldif(r, (0, sqrt(2)\sqrt(2), 0))<1e-10)

timer_on(2)
r2 = dist2(p0)
timer_off(2)
timer()
assert(mreldif(r2, r1) < 1e-8)
end

进一步改善

mata:
real matrix dist3(real matrix p)
{
        real scalar r, c
        real scalar i, j
        real matrix res, cross
        
        r = rows(p)
        c = cols(p)
        
        if(r==0 && c != 2) {
            return(J(0, 0, .))
        }

        res = J(r, r, .)
        cross = cross(p', p')
        for(i=1; i<=r; i++) {
            for(j=1; j<i; j++) {
                res[i, j] = sqrt(cross[i,i] + 
                            cross[j,j] -
                            2*cross[i,j])
            }
            res[i,i] = 0
        }
        _makesymmetric(res)
        return(res)
}
end

结果

do-file结果

. timer list
   1:      3.01 /        1 =       3.0070
   2:      1.52 /        1 =       1.5190
   3:      0.71 /        1 =       0.7130

谢谢