MAML Few-Shot Learning Algorithm Explained, with Code Implementation
2023-03-08
Mathematical Formulation
The goal of MAML training is to obtain a set of optimal initial parameters for the model, so that it can rapidly adapt (fast adaptation) to new tasks. The authors argue that some features transfer to other tasks more readily than others; such features are broadly applicable across tasks. Since a few-shot learning task provides only a small number of labeled samples, training the model for many update rounds on those few samples is bound to cause overfitting, so the model should be adapted with as few gradient steps as possible. This requires the model to start from a set of initial parameters that are broadly applicable to a wide range of tasks, parameters that encode the prior knowledge the model has learned on the meta-training set.
Suppose the model is represented by a function f_θ, where θ denotes the model parameters. When adapting to a new task T_i, the model takes one (or several) gradient-descent steps, updating θ to θ'_i:

θ'_i = θ − α ∇_θ L_{T_i}(f_θ)

where α is the step size (learning rate) of the inner-loop adaptation.
Across multiple different tasks, the model parameters θ are optimized by evaluating the loss at the adapted parameters θ'_i. Concretely, the goal of meta-learning is to find a set of parameters θ such that, over the task distribution p(T), the model can rapidly adapt to every task with minimal loss. Expressed as a formula:

min_θ Σ_{T_i∼p(T)} L_{T_i}(f_{θ'_i}) = Σ_{T_i∼p(T)} L_{T_i}(f_{θ − α ∇_θ L_{T_i}(f_θ)})
With stochastic gradient descent (SGD), the model parameters θ are then updated according to:

θ ← θ − β ∇_θ Σ_{T_i∼p(T)} L_{T_i}(f_{θ'_i})

where β is the meta (outer-loop) learning rate.
Note that the parameters we ultimately optimize are θ, yet the loss is computed at the adapted parameters θ'_i. The training process is illustrated in the figure below.
Because of how this meta-learning method computes its loss and optimizes its parameters, training consists of two nested loops. The outer loop is the meta-learning process: a batch of tasks is sampled from the task distribution, and the loss over this batch of tasks is computed. The inner loop is the adaptation process: for each task, the parameters are updated to θ'_i with one (or several) gradient-descent steps, and the loss is then evaluated at θ'_i. During backpropagation, the gradients must flow back through both loops to the initial parameters θ to complete the meta-learning parameter update.
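To make the two nested loops concrete, the following is a minimal sketch of a single meta-update. It assumes a generic model that exposes parameters() and a functional forward model(x, params=...) (as the MAML class later in this article does); the helper function name and the use of plain cross-entropy here are illustrative assumptions, not the project's actual training code.

import paddle
import paddle.nn.functional as F

def maml_outer_step(model, meta_optimizer, alpha, task_batch):
    # One meta-update over a batch of tasks; each task is a
    # (support_x, support_y, query_x, query_y) tuple.
    meta_loss = 0
    for support_x, support_y, query_x, query_y in task_batch:  # outer loop over sampled tasks
        # Inner loop: one gradient step on the support set produces the fast weights theta',
        # without overwriting the initial parameters theta.
        support_loss = F.cross_entropy(model(support_x, params=model.parameters()), support_y)
        grads = paddle.grad(support_loss, model.parameters())
        fast_weights = [w - alpha * g for w, g in zip(model.parameters(), grads)]
        # The query-set loss is evaluated at the adapted parameters theta'.
        meta_loss += F.cross_entropy(model(query_x, params=fast_weights), query_y)
    # Outer update: gradients flow back through the inner step to the initial parameters theta.
    meta_optimizer.clear_grad()
    meta_loss.backward()
    meta_optimizer.step()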
The complete MAML algorithm is shown in the figure below.
Experimental Results
The experimental results reported in the paper on the Omniglot and miniImageNet datasets are shown in the figures below.
PaddlePaddle Implementation
This section presents part of the core code I completed for the PaddlePaddle Paper Reproduction Challenge (Season 3). The full project code has been released on GitHub and AI Studio; stars and forks are welcome. Links:
GitHub:
AI Studio:
Core Code Implementation
This model is somewhat special in that the gradients must travel back through both loops to the initial parameters. If the model were assembled in the usual way on top of the nn.Layer class, the model parameters would be overwritten when the inner loop updates them, and the initial parameters would be lost. Thanks to the flexible dynamic-graph networking of PaddlePaddle, this project keeps the model parameters separate from the functions that use them: the outer loop holds the original copy of the parameters θ, while the inner loop updates a working copy of these parameters and computes the loss with that copy. The computation graph is built automatically in dynamic-graph mode, and the gradients are ultimately propagated back to the initial parameters θ.
The code for the MAML class is as follows:
import numpy as np
import paddle
import paddle.nn as nn
import paddle.nn.functional as F


class MAML(paddle.nn.Layer):
    def __init__(self, n_way):
        super(MAML, self).__init__()
        # All trainable parameters of the model
        self.vars = []
        self.vars_bn = []
        # ------------------------ 1st conv2d ------------------------
        weight = paddle.static.create_parameter(shape=[64, 1, 3, 3],
                                                dtype='float32',
                                                default_initializer=nn.initializer.KaimingNormal(),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)  # initialized to zero
        self.vars.extend([weight, bias])
        # 1st BatchNorm
        weight = paddle.static.create_parameter(shape=[64],
                                                dtype='float32',
                                                default_initializer=nn.initializer.Constant(value=1),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)  # initialized to zero
        self.vars.extend([weight, bias])
        running_mean = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        running_var = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        self.vars_bn.extend([running_mean, running_var])
        # ------------------------ 2nd conv2d ------------------------
        weight = paddle.static.create_parameter(shape=[64, 64, 3, 3],
                                                dtype='float32',
                                                default_initializer=nn.initializer.KaimingNormal(),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)
        self.vars.extend([weight, bias])
        # 2nd BatchNorm
        weight = paddle.static.create_parameter(shape=[64],
                                                dtype='float32',
                                                default_initializer=nn.initializer.Constant(value=1),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)  # initialized to zero
        self.vars.extend([weight, bias])
        running_mean = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        running_var = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        self.vars_bn.extend([running_mean, running_var])
        # ------------------------ 3rd conv2d ------------------------
        weight = paddle.static.create_parameter(shape=[64, 64, 3, 3],
                                                dtype='float32',
                                                default_initializer=nn.initializer.KaimingNormal(),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)
        self.vars.extend([weight, bias])
        # 3rd BatchNorm
        weight = paddle.static.create_parameter(shape=[64],
                                                dtype='float32',
                                                default_initializer=nn.initializer.Constant(value=1),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)  # initialized to zero
        self.vars.extend([weight, bias])
        running_mean = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        running_var = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        self.vars_bn.extend([running_mean, running_var])
        # ------------------------ 4th conv2d ------------------------
        weight = paddle.static.create_parameter(shape=[64, 64, 3, 3],
                                                dtype='float32',
                                                default_initializer=nn.initializer.KaimingNormal(),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)
        self.vars.extend([weight, bias])
        # 4th BatchNorm
        weight = paddle.static.create_parameter(shape=[64],
                                                dtype='float32',
                                                default_initializer=nn.initializer.Constant(value=1),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[64],
                                              dtype='float32',
                                              is_bias=True)  # initialized to zero
        self.vars.extend([weight, bias])
        running_mean = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        running_var = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
        self.vars_bn.extend([running_mean, running_var])
        # ------------------------ fully connected layer ------------------------
        weight = paddle.static.create_parameter(shape=[64, n_way],
                                                dtype='float32',
                                                default_initializer=nn.initializer.XavierNormal(),
                                                is_bias=False)
        bias = paddle.static.create_parameter(shape=[n_way],
                                              dtype='float32',
                                              is_bias=True)
        self.vars.extend([weight, bias])

    def forward(self, x, params=None, bn_training=True):
        if params is None:
            params = self.vars
        weight, bias = params[0], params[1]  # 1st conv layer
        x = F.conv2d(x, weight, bias, stride=1, padding=1)
        weight, bias = params[2], params[3]  # 1st BN layer
        running_mean, running_var = self.vars_bn[0], self.vars_bn[1]
        x = F.batch_norm(x, running_mean, running_var, weight=weight, bias=bias, training=bn_training)
        x = F.relu(x)  # 1st relu
        x = F.max_pool2d(x, kernel_size=2)  # 1st max-pool layer
        weight, bias = params[4], params[5]  # 2nd conv layer
        x = F.conv2d(x, weight, bias, stride=1, padding=1)
        weight, bias = params[6], params[7]  # 2nd BN layer
        running_mean, running_var = self.vars_bn[2], self.vars_bn[3]
        x = F.batch_norm(x, running_mean, running_var, weight=weight, bias=bias, training=bn_training)
        x = F.relu(x)  # 2nd relu
        x = F.max_pool2d(x, kernel_size=2)  # 2nd max-pool layer
        weight, bias = params[8], params[9]  # 3rd conv layer
        x = F.conv2d(x, weight, bias, stride=1, padding=1)
        weight, bias = params[10], params[11]  # 3rd BN layer
        running_mean, running_var = self.vars_bn[4], self.vars_bn[5]
        x = F.batch_norm(x, running_mean, running_var, weight=weight, bias=bias, training=bn_training)
        x = F.relu(x)  # 3rd relu
        x = F.max_pool2d(x, kernel_size=2)  # 3rd max-pool layer
        weight, bias = params[12], params[13]  # 4th conv layer
        x = F.conv2d(x, weight, bias, stride=1, padding=1)
        weight, bias = params[14], params[15]  # 4th BN layer
        running_mean, running_var = self.vars_bn[6], self.vars_bn[7]
        x = F.batch_norm(x, running_mean, running_var, weight=weight, bias=bias, training=bn_training)
        x = F.relu(x)  # 4th relu
        x = F.max_pool2d(x, kernel_size=2)  # 4th max-pool layer
        x = paddle.reshape(x, [x.shape[0], -1])  # flatten
        weight, bias = params[-2], params[-1]  # linear layer
        x = F.linear(x, weight, bias)
        output = x
        return output

    def parameters(self, include_sublayers=True):
        return self.vars
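To illustrate the separation of parameters and functions described above, here is a brief usage sketch of the MAML class (reusing the imports above). The 1×28×28 input shape matches the Omniglot-style network defined above; the batch size, random labels, and inner-loop step size of 0.1 are illustrative assumptions.

# Usage sketch: the same network can run with its stored initial parameters
# or with an externally supplied (inner-loop updated) parameter list.
net = MAML(n_way=5)
x = paddle.randn([25, 1, 28, 28])             # a 5-way, 5-shot support batch (assumed shape)
y = paddle.randint(0, 5, [25])

logits = net(x)                               # forward pass with the stored initial parameters theta
loss = F.cross_entropy(logits, y)
grads = paddle.grad(loss, net.parameters())
fast_weights = [w - 0.1 * g for w, g in zip(net.parameters(), grads)]

logits_adapted = net(x, params=fast_weights)  # forward pass with the adapted parameters theta'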
The code for the MetaLearner class is as follows:
from copy import deepcopy


class MetaLearner(nn.Layer):
    def __init__(self, n_way, glob_update_step, glob_update_step_test, glob_meta_lr, glob_base_lr):
        super(MetaLearner, self).__init__()
        self.update_step = glob_update_step  # task-level inner update steps
        self.update_step_test = glob_update_step_test
        self.net = MAML(n_way=n_way)
        self.meta_lr = glob_meta_lr  # outer-loop learning rate
        self.base_lr = glob_base_lr  # inner-loop learning rate
        self.meta_optim = paddle.optimizer.Adam(learning_rate=self.meta_lr, parameters=self.net.parameters())

    def forward(self, x_spt, y_spt, x_qry, y_qry):
        task_num = x_spt.shape[0]
        query_size = x_qry.shape[1]  # 75 = 15 * 5
        loss_list_qry = [0 for _ in range(self.update_step + 1)]
        correct_list = [0 for _ in range(self.update_step + 1)]

        # Inner-loop gradients are applied by hand; outer-loop gradients use the optimizer defined above
        for i in range(task_num):
            # update step 0
            y_hat = self.net(x_spt[i], params=None, bn_training=True)  # (setsz, n_way)
            loss = F.cross_entropy(y_hat, y_spt[i])
            grad = paddle.grad(loss, self.net.parameters())  # gradients of the loss w.r.t. all parameters
            tuples = zip(grad, self.net.parameters())  # pair each gradient with its parameter
            # fast_weights corresponds to theta - alpha * nabla(L)
            fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0], tuples))
            # Evaluate on the query set with the parameters before the update:
            # the loss goes into loss_list_qry[0] and the accuracy into correct_list[0]
            with paddle.no_grad():
                y_hat = self.net(x_qry[i], self.net.parameters(), bn_training=True)
                loss_qry = F.cross_entropy(y_hat, y_qry[i])
                loss_list_qry[0] += loss_qry
                pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75); axis=-1 also works
                correct = paddle.equal(pred_qry, y_qry[i]).numpy().sum().item()
                correct_list[0] += correct
            # Evaluate on the query set with the updated parameters:
            # the loss goes into loss_list_qry[1] and the accuracy into correct_list[1]
            with paddle.no_grad():
                y_hat = self.net(x_qry[i], fast_weights, bn_training=True)
                loss_qry = F.cross_entropy(y_hat, y_qry[i])
                loss_list_qry[1] += loss_qry
                pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75)
                correct = paddle.equal(pred_qry, y_qry[i]).numpy().sum().item()
                correct_list[1] += correct

            # remaining update steps
            for k in range(1, self.update_step):
                y_hat = self.net(x_spt[i], params=fast_weights, bn_training=True)
                loss = F.cross_entropy(y_hat, y_spt[i])
                grad = paddle.grad(loss, fast_weights)
                tuples = zip(grad, fast_weights)
                fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0], tuples))

                if k < self.update_step - 1:
                    with paddle.no_grad():
                        y_hat = self.net(x_qry[i], params=fast_weights, bn_training=True)
                        loss_qry = F.cross_entropy(y_hat, y_qry[i])
                        loss_list_qry[k + 1] += loss_qry
                else:  # keep the final step's loss in the graph so the outer-loop gradients can propagate
                    y_hat = self.net(x_qry[i], params=fast_weights, bn_training=True)
                    loss_qry = F.cross_entropy(y_hat, y_qry[i])
                    loss_list_qry[k + 1] += loss_qry

                with paddle.no_grad():
                    pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)
                    correct = paddle.equal(pred_qry, y_qry[i]).numpy().sum().item()
                    correct_list[k + 1] += correct

        loss_qry = loss_list_qry[-1] / task_num  # average loss of the final step
        self.meta_optim.clear_grad()  # zero the gradients
        loss_qry.backward()
        self.meta_optim.step()

        accs = np.array(correct_list) / (query_size * task_num)  # accuracy at each update step
        loss = np.array(loss_list_qry) / task_num  # loss at each update step
        return accs, loss

    def finetunning(self, x_spt, y_spt, x_qry, y_qry):
        # assert len(x_spt.shape) == 4
        query_size = x_qry.shape[0]
        correct_list = [0 for _ in range(self.update_step_test + 1)]

        new_net = deepcopy(self.net)
        y_hat = new_net(x_spt)
        loss = F.cross_entropy(y_hat, y_spt)
        grad = paddle.grad(loss, new_net.parameters())
        fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0], zip(grad, new_net.parameters())))

        # Evaluate on the query set with the parameters before the update
        with paddle.no_grad():
            y_hat = new_net(x_qry, params=new_net.parameters(), bn_training=True)
            pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75)
            correct = paddle.equal(pred_qry, y_qry).numpy().sum().item()
            correct_list[0] += correct

        # Evaluate on the query set with the updated parameters
        with paddle.no_grad():
            y_hat = new_net(x_qry, params=fast_weights, bn_training=True)
            pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75)
            correct = paddle.equal(pred_qry, y_qry).numpy().sum().item()
            correct_list[1] += correct

        for k in range(1, self.update_step_test):
            y_hat = new_net(x_spt, params=fast_weights, bn_training=True)
            loss = F.cross_entropy(y_hat, y_spt)
            grad = paddle.grad(loss, fast_weights)
            fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0], zip(grad, fast_weights)))

            y_hat = new_net(x_qry, fast_weights, bn_training=True)

            with paddle.no_grad():
                pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)
                correct = paddle.equal(pred_qry, y_qry).numpy().sum().item()
                correct_list[k + 1] += correct

        del new_net
        accs = np.array(correct_list) / query_size
        return accs
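As a quick sanity check of the interface, the MetaLearner can be driven with randomly generated tensors shaped like one meta-batch of tasks. The shapes below (32 tasks, 5-way, 1-shot support, 15 query images per class) and the hyperparameter values are illustrative assumptions, not the project's actual training configuration.

# Drive one meta-training step with random data shaped like a meta-batch of tasks.
meta = MetaLearner(n_way=5, glob_update_step=5, glob_update_step_test=10,
                   glob_meta_lr=1e-3, glob_base_lr=0.1)

task_num, n_way, k_shot, k_query = 32, 5, 1, 15
x_spt = paddle.randn([task_num, n_way * k_shot, 1, 28, 28])
y_spt = paddle.randint(0, n_way, [task_num, n_way * k_shot])
x_qry = paddle.randn([task_num, n_way * k_query, 1, 28, 28])
y_qry = paddle.randint(0, n_way, [task_num, n_way * k_query])

accs, losses = meta(x_spt, y_spt, x_qry, y_qry)  # one outer-loop (meta) update
print(accs)  # query accuracy after 0, 1, ..., update_step inner steps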
Reproduction Results
This project reproduced the experiments on the Omniglot dataset; the reproduced results are shown below:
Summary
This article briefly introduced the research background, methodology, and commonly used datasets of few-shot learning, and then described in detail the MAML meta-learning model: its method, experimental results, and core code. This model is a good entry point for learning about few-shot learning, and it also serves as the baseline against which the performance of new methods is compared. Understanding and mastering this classic model lays a solid foundation for further theoretical research and practical applications. PaddleFSL, PaddlePaddle's official few-shot learning toolkit, now provides few-shot learning solutions covering computer vision and natural language processing applications, including MAML, ProtoNet, Relation Net, and more. It is the first few-shot learning toolkit built on PaddlePaddle; everyone is welcome to follow it and explore together.
References
[1] Vinyals O, Blundell C, Lillicrap T, et al. Matching Networks for One Shot Learning[J], 2016.
[2] Ravi S, Larochelle H. Optimization as a model for few-shot learning[J], 2016.
[3] Ren M, Triantafillou E, Ravi S, et al. Meta-learning for semi-supervised few-shot classification[J]. arXiv preprint arXiv:1803.00676, 2018.
[4] Oreshkin B N, Rodriguez P, Lacoste A. Tadam: Task dependent adaptive metric for improved few-shot learning[J]. arXiv preprint arXiv:1805.10123, 2018.
[5] Lake B, Salakhutdinov R, Gross J, et al. One shot learning of simple visual concepts[C]. Proceedings of the annual meeting of the cognitive science society, 2011.
[6] Wah C, Branson S, Welinder P, et al. The caltech-ucsd birds-200-2011 dataset[J], 2011.
[7] Finn C, Abbeel P, Levine S. Model-agnostic meta-learning for fast adaptation of deep networks[C]. International Conference on Machine Learning, 2017: 1126-1135.
END