What does the aggregate function mean in Python?

Spark function explained: aggregateByKey – 过往记忆
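  This function is similar to aggregate, but it operates on pair RDDs. It was formally introduced in version 1.1.0. The official documentation defines it as follows: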
Aggregate the values of each key, using given combine functions and a neutral "zero value". This function can return a different result type, U, than the type of the values in this RDD, V. Thus, we need one operation for merging a V into a U and one operation for merging two U's, as in scala.TraversableOnce. The former operation is used for merging values within a partition, and the latter is used for merging values between partitions. To avoid memory allocation, both of these functions are allowed to modify and return their first argument instead of creating a new U.
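  aggregateByKey aggregates the values that share the same key in a pair RDD, and the aggregation likewise uses a neutral initial value. As with aggregate, the result type of aggregateByKey does not have to match the type of the values in the RDD. Because aggregateByKey aggregates the values under each key, it still returns a pair RDD whose entries are each key and its aggregated value, whereas aggregate returns a plain (non-RDD) result directly; keep this difference in mind. The implementation defines three aggregateByKey overloads, but they all end up calling the same one.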
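The point about the result type differing from the value type can be illustrated without Spark. The following is a minimal plain-Python sketch, not Spark's implementation: the helper `aggregate_by_key` and the two-partition layout are assumptions made for illustration. It folds integer values into a `(sum, count)` tuple per key, so the result type differs from the value type.

```python
def aggregate_by_key(partitions, zero_value, seq_op, comb_op):
    """Toy model of aggregateByKey over a list of partitions of (key, value) pairs."""
    merged = {}
    for part in partitions:
        # Within a partition: fold each key's values into an accumulator
        # that starts from the zero value.
        local = {}
        for key, value in part:
            acc = local.get(key, zero_value)
            local[key] = seq_op(acc, value)
        # Across partitions: merge the per-partition accumulators.
        for key, acc in local.items():
            merged[key] = comb_op(merged[key], acc) if key in merged else acc
    return sorted(merged.items())


# Hypothetical layout: the four pairs are split over two partitions.
partitions = [[(1, 3), (1, 2)], [(1, 4), (2, 3)]]
sum_count = aggregate_by_key(
    partitions,
    (0, 0),                                    # zero value of type (sum, count)
    lambda acc, v: (acc[0] + v, acc[1] + 1),   # merge a value into an accumulator
    lambda a, b: (a[0] + b[0], a[1] + b[1]),   # merge two accumulators
)
print(sum_count)  # [(1, (9, 3)), (2, (3, 1))]
```

Dividing each sum by its count would then give the per-key average, which a plain reduce over integers could not express in a single pass.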
def aggregateByKey[U: ClassTag](zeroValue: U, partitioner: Partitioner)
    (seqOp: (U, V) => U, combOp: (U, U) => U): RDD[(K, U)]
def aggregateByKey[U: ClassTag](zeroValue: U, numPartitions: Int)
    (seqOp: (U, V) => U, combOp: (U, U) => U): RDD[(K, U)]
def aggregateByKey[U: ClassTag](zeroValue: U)
    (seqOp: (U, V) => U, combOp: (U, U) => U): RDD[(K, U)]
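  The first aggregateByKey overload lets us supply a custom Partitioner. Apart from that parameter, its declaration is very similar to that of aggregate; the other aggregateByKey overloads all end up calling this one.
  The second aggregateByKey overload lets us set the number of partitions (numPartitions); it ends up using a HashPartitioner.
  The last aggregateByKey overload first checks whether the current RDD already has a partitioner defined. If it does, that partitioner is used; if not, a HashPartitioner is used.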
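As a rough illustration of what hash partitioning means, the sketch below assigns each pair to the partition given by the key's hash modulo the number of partitions. It is plain Python using the built-in `hash` in place of the JVM `hashCode`; the helper name `hash_partition` is an assumption, not a Spark API.

```python
def hash_partition(pairs, num_partitions):
    """Assign each (key, value) pair to partition hash(key) mod num_partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        # Python's % yields a non-negative result for a positive modulus,
        # so the index is always valid.
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


data = [(1, 3), (1, 2), (1, 4), (2, 3)]
parts = hash_partition(data, 2)
print(parts)  # [[(2, 3)], [(1, 3), (1, 2), (1, 4)]]
```

All pairs with the same key land in the same partition, which is what allows the per-key results to be completed after the shuffle.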
* User: 过往记忆
* Date: 15-03-02
* Time: 06:30 AM
* Article URL: /archives/1261
scala> var data = sc.parallelize(List((1,3),(1,2),(1, 4),(2,3)))
scala> def seq(a:Int, b:Int) : Int ={
     | println("seq: " + a + "\t " + b)
     | math.max(a,b)
     | }
seq: (a: Int, b: Int)Int
scala> def comb(a:Int, b:Int) : Int ={
     | println("comb: " + a + "\t " + b)
     | a + b
     | }
comb: (a: Int, b: Int)Int
scala> data.aggregateByKey(1)(seq, comb).collect
res62: Array[(Int, Int)] = Array((1,9), (2,3))
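  Attentive readers will have noticed that aggregateByKey and aggregate give slightly different results. If aggregate is applied to an RDD containing the three elements 3, 2 and 4 with an initial value of 1, the result is 10, whereas here it is 9. The reason is that in aggregate the initial value takes part in both the within-partition function (seqOp) and the combine function (combOp), while in aggregateByKey it takes part only in seqOp and not in combOp. Both figures assume that each of the three values sits in its own partition, so seq yields 3, 2 and 4 and comb adds them up.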
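The arithmetic behind 10 versus 9 can be replayed in plain Python. This is a sketch of the semantics only, not Spark code, and it assumes that each of the three values for key 1 sits in its own partition and that comb adds its arguments.

```python
from functools import reduce


def seq(a, b):
    return max(a, b)


def comb(a, b):
    return a + b


# Assumption: one value per partition.
partitions = [[3], [2], [4]]
zero = 1

# Per-partition results: seq folds the zero value in once per partition.
partials = [reduce(seq, part, zero) for part in partitions]  # [3, 2, 4]

# aggregate-style: the zero value also seeds the merge across partitions.
aggregate_result = reduce(comb, partials, zero)

# aggregateByKey-style: the partial results are merged with comb only.
aggregate_by_key_result = reduce(comb, partials)

print(aggregate_result, aggregate_by_key_result)  # 10 9
```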
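Using Aggregate in LINQ
Summary: Aggregate in LINQ can be used for simple sums and factorials over a collection as well as for more complex computations. Combined with a lambda, Aggregate does in a few lines what used to take many. The following introduces the basic usage of Aggregate in LINQ.
LINQ's Aggregate can perform fairly complex aggregations, such as running sums and running products.
The accumulator function passed to Aggregate takes two parameters: the first is the accumulated value (by default it starts as the first element), and the second is the next element. After each step, the result replaces the first parameter and takes part in the next step.
I. How Aggregate is implemented
Internally, Aggregate is the same code as the "ordinary approach" of writing the loop by hand; Aggregate is just a thin wrapper around that code.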
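The hand-written loop that such a wrapper hides might look like the following. This is a Python sketch of the idea rather than the .NET source; the function name `aggregate` is chosen for illustration, and `functools.reduce` is shown as the closest standard-library equivalent.

```python
from functools import reduce


def aggregate(items, func):
    """Fold items left to right, using the first element as the initial accumulator."""
    iterator = iter(items)
    try:
        total = next(iterator)
    except StopIteration:
        # With no seed there is nothing to start from.
        raise ValueError("sequence contains no elements") from None
    for item in iterator:
        # The result of each step replaces the accumulator for the next step.
        total = func(total, item)
    return total


numbers = [1, 2, 3, 4, 5]
product = aggregate(numbers, lambda total, nxt: total * nxt)
print(product)  # 120
print(product == reduce(lambda total, nxt: total * nxt, numbers))  # True
```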
II. Aggregate examples in LINQ
1. Computing a factorial with Aggregate
C# code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var numbers = GetArray(5);
            // Multiply the running total by each next element.
            var result = (from n in numbers
                          select n).Aggregate(
                              (total, next) =>
                              {
                                  return total * next;
                              });
            Console.WriteLine("The factorial of 5 is: {0}", result); // prints 120, i.e. 1*2*3*4*5
        }
        // Returns the integers 1..max.
        static IEnumerable<int> GetArray(int max)
        {
            List<int> result = new List<int>(max);
            for (int i = 0; i < max; i++)
                result.Add(i + 1);
            return result;
        }
    }
}
2. Using Aggregate for simple sums and factorials over a collection
C# code
using System;
using System.Linq;
class Program
{
    static void Main()
    {
        int[] array = { 1, 2, 3, 4, 5 };
        int result = array.Aggregate((a, b) => b + a);
        // 1 + 2 = 3
        // 3 + 3 = 6
        // 6 + 4 = 10
        // 10 + 5 = 15
        Console.WriteLine(result);
        result = array.Aggregate((a, b) => b * a);
        // 1 * 2 = 2
        // 2 * 3 = 6
        // 6 * 4 = 24
        // 24 * 5 = 120
        Console.WriteLine(result);
        // Output:
        // 15
        // 120
    }
}
3. Reversing the order of the words in a string with Aggregate
C# code
string sentence = "the quick brown fox jumps over the lazy dog";
string[] words = sentence.Split(' ');
// Prepend each next word to the sentence built so far.
string reversed = words.Aggregate((workingSentence, next) => next + " " + workingSentence);
Console.WriteLine(reversed);
// Output:
// dog lazy the over jumps fox brown quick the
4. Using Aggregate with an accumulator function and a result selector
The following example uses LINQ's Aggregate method to find the longest string in the array that is longer than "banana", and converts it to upper case.
C# code
string[] fruits = { "apple", "mango", "orange", "passionfruit", "grape" };
string longestName =
    fruits.Aggregate("banana",
        (longest, next) =>
            next.Length > longest.Length ? next : longest,
        fruit => fruit.ToUpper());
Console.WriteLine(
    "The fruit with the longest name is {0}.", longestName);
// Output:
// The fruit with the longest name is PASSIONFRUIT.
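For readers coming from Python, the seed, accumulator and result selector combination can be approximated with `functools.reduce` plus a final mapping step. This is a sketch under that assumption; the wrapper `aggregate` below is a hypothetical helper, not a standard-library function.

```python
from functools import reduce


def aggregate(items, seed, func, result_selector=lambda acc: acc):
    """Fold items starting from seed, then map the final accumulator."""
    return result_selector(reduce(func, items, seed))


fruits = ["apple", "mango", "orange", "passionfruit", "grape"]
longest_name = aggregate(
    fruits,
    "banana",  # seed: only strictly longer names replace it
    lambda longest, nxt: nxt if len(nxt) > len(longest) else longest,
    str.upper,  # result selector applied once, to the final accumulator
)
print(longest_name)  # PASSIONFRUIT
```

If no element is longer than the seed, the seed itself is returned (in upper case), which is why "banana" acts as a lower bound on the length.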
5. Using Aggregate with an accumulator function and a seed value
The following example uses LINQ's Aggregate method to count the even numbers in an array.
C# code
int[] ints = { 4, 8, 8, 3, 9, 0, 7, 8, 2 };
// Count the even numbers in the array: the seed is 0, and each even number adds 1.
int numEven = ints.Aggregate(0, (total, next) =>
    next % 2 == 0 ? total + 1 : total);
Console.WriteLine("The number of even integers is: {0}", numEven);
// Output:
// The number of even integers is: 6
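Why the seed matters here can be seen by running the same fold with and without it. The Python sketch below uses `functools.reduce`; the sample values are hypothetical, chosen so that six of the nine integers are even.

```python
from functools import reduce

ints = [4, 8, 8, 3, 9, 0, 7, 8, 2]  # hypothetical sample values


def count_even(total, nxt):
    # Add 1 to the running count for every even number.
    return total + 1 if nxt % 2 == 0 else total


with_seed = reduce(count_even, ints, 0)
# Without a seed, the first element (4) is taken as the starting count.
without_seed = reduce(count_even, ints)

print(with_seed)     # 6
print(without_seed)  # 9
```

The unseeded version is wrong because the accumulator is a count, not an element: the fold needs a starting value of the result type, which is exactly what the seed provides.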