[DeepLearning] A First Look at the Caffe Deep Learning Framework: Leaf Lesion Detection

Posted 2016-5-10 13:46:49

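0 Introduction
Caffe (http://caffe.berkeleyvision.org/) is a clean and efficient deep learning framework. Its author is Yangqing Jia (http://daggerfs.com/), who received his PhD from UC Berkeley. He wrote Caffe as a graduate student, and Berkeley continues to maintain it.
I have recently been studying image classification with deep learning, and Caffe happens to include an ImageNet example that can be adapted. This post covers: the dataset used, the changes to the Caffe network files, dataset processing, validation of the model's accuracy, and lesion detection.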
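1 Dataset
My dataset is a set of leaf images with lesions. The images fall into three types:

The first type is whole images that contain lesions.

The second type is cropped lesion windows.

The third type is non-lesion images, mainly background windows.
2 Training with Caffe
To separate lesions from non-lesions well, I trained with the second and third types above as positive and negative samples, i.e. lesion windows and non-lesion windows.
2.1 Network structure
$CaffeHome/examples contains several deep learning examples. The one I used as a reference is cifar10, a deep network for image classification (http://www.cs.toronto.edu/~kriz/cifar.html).
We need several files: leaf_full.prototxt, leaf_full_solver.prototxt, leaf_full_train_test.prototxt and train_full.sh, each of which has to be modified accordingly.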

File: leaf_full.prototxt
name: "CIFAR10_full_deploy"
# N.B. input image must be in CIFAR-10 format
# as described at http://www.cs.toronto.edu/~kriz/cifar.html
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 32
  dim: 32
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 3
    alpha: 5e-05
    beta: 0.75
    norm_region: WITHIN_CHANNEL
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 3
    alpha: 5e-05
    beta: 0.75
    norm_region: WITHIN_CHANNEL
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
    decay_mult: 250
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip1"
  top: "prob"
}
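leaf_full.prototxt is the network definition file. Since this is only a first attempt at learning Caffe, I kept the original cifar10 network structure. A few places do need to be changed, though. First, at the top:

input_shape {
  dim: 1
  dim: 3
  dim: 32
  dim: 32
}

The first value is the number of images read in at a time. The second is the number of channels, which is 3 because I use color images. The last two are the image size, and my input images are 32x32. In addition, for grayscale images we need to provide a mean file in binaryproto format. It can be generated with a tool that ships with Caffe, and the file can also be found in my GitHub code.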
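To see what the 32x32 input turns into as it passes through the layers listed in leaf_full.prototxt, the spatial sizes can be worked out by hand. The sketch below is a stand-alone calculation and not part of Caffe; conv_out, pool_out and trace_sizes are names made up for illustration, and it assumes floor rounding for convolution and ceil rounding for pooling.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    # Output size of a convolution layer (floor rounding, assumed).
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    # Output size of a pooling layer (ceil rounding, assumed).
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


def trace_sizes(size):
    # Follow conv1/pool1, conv2/pool2, conv3/pool3 from the listing.
    sizes = []
    for _ in range(3):
        size = conv_out(size, kernel=5, stride=1, pad=2)
        sizes.append(size)
        size = pool_out(size, kernel=3, stride=2)
        sizes.append(size)
    return sizes


sizes = trace_sizes(32)
# conv3 has 64 output channels, so ip1 sees 64 * H * W values.
ip1_inputs = 64 * sizes[-1] * sizes[-1]
print(sizes, ip1_inputs)
```

Under these assumptions each convolution keeps the spatial size and each pooling layer halves it, so only the final inner product layer depends on the number of classes.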
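Also, at the end we must set num_output, which can be understood as the number of classes. Mine is a binary classification, so num_output is 2. The same change has to be made in leaf_full_train_test.prototxt.

inner_product_param {
  num_output: 2
}
leaf_full_solver.prototxt sets the parameters of the learning process, such as the learning rate and the number of iterations, and can be adjusted to the situation at hand.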
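2.2 Data preprocessing
For training, Caffe can read its input data in either lmdb or leveldb format. I chose lmdb: although lmdb uses about 1.1 times the memory of leveldb, it is 10% to 15% faster, and more importantly it lets several training models read the same dataset at the same time. Caffe provides conversion tools for this as well, located in $CaffeHome/tools.
We also need to provide a label file, in the format shown in the figure below: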
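Judging from what multiDataProcessing.py writes, each line of the label file holds an image name and its class label separated by a single space. A minimal sketch of reading such lines back follows; parse_label_lines is a hypothetical helper and the sample file names are invented.

```python
def parse_label_lines(lines):
    """Parse 'image_name label' lines into (name, label) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space so the label is always the final field.
        name, label = line.rsplit(' ', 1)
        pairs.append((name, int(label)))
    return pairs


sample = ["0_0.jpg 0", "1_0.jpg 1", ""]
pairs = parse_label_lines(sample)
print(pairs)
```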

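My code likewise handles the conversion between the binaryproto and npy data files. The npy file is used, once a model has been produced, for predicting one or more images.
I wrote my own script to drive the whole preprocessing pipeline. DataDir may contain images of several classes, in folders named 1, 2, ..., n. The script calls multiDataProcessing.py, which splits each class's images according to the training ratio (trainingRate, set in the script, which controls how many images go to train and how many to val) and generates the label files. It then calls the other programs to generate the remaining data files.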
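The split by trainingRate can be isolated from the file copying as a small pure function. This is only an illustrative sketch; split_train_val and its seed parameter are assumptions added here so the result is repeatable, and they are not part of the original scripts.

```python
import random


def split_train_val(items, training_rate, seed=None):
    """Shuffle items and split them into (train, val) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Same rule as the script: truncate rate * count to an integer.
    train_num = int(training_rate * len(shuffled))
    return shuffled[:train_num], shuffled[train_num:]


names = ["img%d.jpg" % i for i in range(10)]
train, val = split_train_val(names, 0.7, seed=0)
print(len(train), len(val))
```

Every image ends up in exactly one of the two lists, so the val set never overlaps the training set.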

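Pay special attention to the input image size, which is 32x32 here. create_leaf.sh has a resize option that resizes the images, so you do not have to do it yourself. I also provide a Python script, resize.py, for preprocessing the image size.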

File: leaf.sh
#!/usr/bin/env sh
# Pre-process the training data, then build the lmdb, mean and npy files.
HomeDir=~/DeepLearning/caffe/examples/leaf
TOOLS=~/DeepLearning/caffe/build/tools
DataDir=~/DeepLearning/DataProcessing/isLeaf
TrainingDir=$HomeDir/leaf_train_data
ValuatingDir=$HomeDir/leaf_val_data
# Despite their names, these are the train and val label files.
LabelOfPos=$HomeDir/train_Label.txt
LabelOfNeg=$HomeDir/val_Label.txt
trainingRate=0.7

echo "Starting Pre-processing the training data..."
if [ -d "$TrainingDir" ]; then
    rm -rf "$TrainingDir"
    echo "Old leaf_train_data is removed!"
fi
if [ -d "$ValuatingDir" ]; then
    rm -rf "$ValuatingDir"
    echo "Old leaf_val_data is removed!"
fi
python multiDataProcessing.py "$DataDir" \
    "$TrainingDir" "$ValuatingDir" \
    "$LabelOfPos" "$LabelOfNeg" \
    "$trainingRate"
echo "Pre-processing is finished!"

echo "Starting creating the input data..."
sh "$HomeDir/create_leaf.sh"
echo "Creating data is finished!"

echo "Starting making binaryproto..."
if [ -f "$HomeDir/leaf_mean.binaryproto" ]; then
    rm "$HomeDir/leaf_mean.binaryproto"
    echo "old leaf_mean.binaryproto is removed!"
fi
"$TOOLS/compute_image_mean" "$HomeDir/leaf_train_lmdb" \
    "$HomeDir/leaf_mean.binaryproto"
echo "making binaryproto is finished!"

echo "Starting making data of npy..."
if [ -f "$HomeDir/leaf_mean.npy" ]; then
    rm "$HomeDir/leaf_mean.npy"
fi
python "$HomeDir/convert_protomean.py" "$HomeDir/leaf_mean.binaryproto" \
    "$HomeDir/leaf_mean.npy"
if [ -f "$HomeDir/leaf_mean.npy" ]; then
    echo "leaf_mean.npy is finished!"
fi
echo "All completed!"
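The mean file built by compute_image_mean is an element-wise average over the training images, which is later subtracted from each input. The following is only a toy illustration of that idea on small nested lists, not Caffe's implementation; mean_image and subtract_mean are hypothetical names.

```python
def mean_image(images):
    """Element-wise mean of equally sized 2-D images (lists of rows)."""
    count = float(len(images))
    height, width = len(images[0]), len(images[0][0])
    return [[sum(img[y][x] for img in images) / count
             for x in range(width)]
            for y in range(height)]


def subtract_mean(image, mean):
    """Center one image by subtracting the mean image from it."""
    return [[image[y][x] - mean[y][x] for x in range(len(image[0]))]
            for y in range(len(image))]


images = [[[0, 2], [4, 6]],
          [[2, 4], [6, 8]]]
mean = mean_image(images)
centered = subtract_mean(images[0], mean)
print(mean, centered)
```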

File: multiDataProcessing.py
# Usage: python multiDataProcessing.py DataDir TrainDir ValDir \
#            TrainLabel ValLabel trainingRate
import os
import sys
import random
import shutil


def main(argv):
    classList = os.listdir(argv[1])
    trainList = []
    valList = []
    # make train & val dirs
    os.mkdir(argv[2])
    os.mkdir(argv[3])
    for item in classList:
        count = 0
        imageDir = os.path.join(argv[1], item)
        imageList = os.listdir(imageDir)
        # shuffle the image list
        random.shuffle(imageList)
        # number of images of this class that go to the training set
        trainNum = int(float(argv[6]) * len(imageList))
        # divide data into training and validation
        for im in imageList:
            src = os.path.join(imageDir, im)
            # rename each image to <class>_<index>.jpg
            image_name = item + '_' + str(count) + '.jpg'
            if count < trainNum:
                # training data
                dst = os.path.join(argv[2], image_name)
                trainList.append([image_name, item])
            else:
                # validation data
                dst = os.path.join(argv[3], image_name)
                valList.append([image_name, item])
            shutil.copy(src, dst)
            count += 1
    # write the label files, one "image_name label" pair per line
    with open(argv[4], 'w') as f:
        for item in trainList:
            f.write(item[0] + ' ' + item[1] + '\n')
    with open(argv[5], 'w') as f:
        for item in valList:
            f.write(item[0] + ' ' + item[1] + '\n')


if __name__ == '__main__':
    main(sys.argv)
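2.3 Model training
Once preprocessing is done, run train_full.sh to train. During training, results such as the accuracy are printed every so many iterations. These settings can all be changed in leaf_full_solver.prototxt.

Because this is a binary classification, an accuracy above 99% is reached quickly.
3 Prediction and validation
If you have doubts about the results Caffe reports, you can use the Python script that ships with Caffe to run predictions on one or several files and inspect the output.

First, prediction on a single file:

The image I fed in is a positive sample, i.e. a lesion window (I label negative samples 0 and positive samples 1). As the figure shows, Caffe outputs two values: the first is the probability of class 0 and the second is the probability of class 1. The trained model puts the probability of class 1 close to 1, which agrees with the correct answer.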
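The two values come from the final Softmax layer, so they sum to 1 and the predicted class is simply the index of the larger one. A minimal pure-Python sketch of that last step is shown here; the raw scores are made up, and softmax and predict_class are illustrative helpers rather than Caffe functions.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=lambda i: probs[i])


probs = softmax([-3.0, 4.0])  # hypothetical ip1 outputs
label = predict_class(probs)  # index 1 is the lesion class
print(probs, label)
```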
I also ran predictions on a large number of images to measure the accuracy:

File: classify.py (key code)
image_dims = [int(s) for s in args.images_dim.split(',')]
mean, channel_swap = None, None
if args.mean_file:
    mean = np.load(args.mean_file)
if args.channel_swap:
    channel_swap = [int(s) for s in args.channel_swap.split(',')]
caffe.set_mode_gpu()
# Make classifier.
classifier = caffe.Classifier(args.model_def, args.pretrained_model,
                              image_dims=image_dims, mean=mean,
                              input_scale=args.input_scale,
                              raw_scale=args.raw_scale,
                              channel_swap=channel_swap)
# Expand "~" first, because the image paths below are built from input_file.
args.input_file = os.path.expanduser(args.input_file)
# Read the label file: one "image_name label" pair per line.
label = []
image_list = []
with open(args.label_file, 'r') as f:
    for line in f:
        temp = line.strip('\n').split(' ')
        image_list.append(args.input_file + '/' + temp[0])
        label.append(temp[1])
# Load numpy array (.npy), directory glob (*.jpg), or image file.
if args.input_file.endswith('npy'):
    print("Loading file: %s" % args.input_file)
    inputs = np.load(args.input_file)
elif os.path.isdir(args.input_file):
    print("Loading folder: %s" % args.input_file)
    inputs = [caffe.io.load_image(im_f) for im_f in image_list]
else:
    print("Loading file: %s" % args.input_file)
    inputs = [caffe.io.load_image(args.input_file)]
print("Classifying %d inputs." % len(inputs))
iteration = 10000
correct = 0.0
wrong = 0.0
for num in range(iteration):
    # Classify.
    start = time.time()
    print("%d %s" % (len(inputs[0]), np.shape(inputs[0])))
    predictions = classifier.predict(inputs, not args.center_only)
    print("Iteration " + str(num) + " in %.2f s." % (time.time() - start))
    print(' ')
    # Compare each prediction with the label of the same image.
    count = 0
    for i, output in enumerate(predictions):
        if float(output[0]) > float(output[1]):
            answer = '0'
        else:
            answer = '1'
        if answer == label[i]:
            correct = correct + 1
        else:
            wrong = wrong + 1
            # Show the misclassified image and its expected label.
            print("%s %s %s" % (output, image_list[i], label[i]))
        count = count + 1
    print(count)
    print("Accuracy rate is %.4f" % (correct / (correct + wrong)))
    print(' ')
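The comparison between predictions and labels can also be written as a small stand-alone function, which makes it easy to check on made-up numbers. This is a sketch under the same conventions as the listing (two probabilities per image, labels stored as the strings '0' and '1'); accuracy is a hypothetical helper and the sample values are invented.

```python
def accuracy(predictions, labels):
    """Fraction of predictions whose most probable class equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = 0
    for probs, label in zip(predictions, labels):
        # Class 0 wins only if its probability is strictly larger.
        answer = 0 if probs[0] > probs[1] else 1
        if answer == int(label):
            correct += 1
    return correct / float(len(labels))


preds = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = ['0', '1', '1', '1']
rate = accuracy(preds, labels)
print("Accuracy rate is %.4f" % rate)
```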

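4 Lesion detection
Once we have a model that predicts lesions, we can run a sliding-window algorithm over a whole image to find all the lesions in it. The sliding-window algorithm is slow, of course, and parameters such as the window size and the stride have a considerable effect on the result. Detection is not my main research interest here, though. I am more interested in finding better networks to apply to suitable scenarios, so I only use a simple sliding window.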
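The core of a sliding window fits in a few lines. The generator below is a generic sketch, not the exact traversal used in detect.py (whose move_window treats the image edges slightly differently); sliding_windows and the 64x64 image size are assumptions for illustration, while the window size 32 and step 8 match the values used in this post.

```python
def sliding_windows(height, width, scale, step):
    """Yield [ymin, xmin, ymax, xmax] boxes of size scale x scale."""
    for ymin in range(0, height - scale + 1, step):
        for xmin in range(0, width - scale + 1, step):
            yield [ymin, xmin, ymin + scale, xmin + scale]


windows = list(sliding_windows(64, 64, scale=32, step=8))
print(len(windows), windows[0], windows[-1])
```

Each yielded box would be cropped, resized to the network input size and classified, which is why the number of boxes drives the running time.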

The code is as follows:

File: detect.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import numpy as np
import copy
import shutil
import os
import sys
import argparse
import glob
import time
import cv2
import caffe

threshold = 0.95           # minimum lesion probability to keep a window
window_threshold = 150     # upper bound on the window size
areaThreshold = 0.8        # overlap ratio above which two windows are merged
initial_window_scale = 32
scale_step = 10
height_step = 8
width_step = 8
save_Path = '/home/ganlinhao/DeepLearning/caffe/examples/leaf/Image'


def main(argv):
    pycaffe_dir = os.path.dirname(__file__)
    parser = argparse.ArgumentParser()
    # Required arguments: input and output files.
    parser.add_argument(
        "input_file",
        help="Input image, directory, or npy."
    )
    # Optional arguments.
    parser.add_argument(
        "--model_def",
        default=os.path.join(pycaffe_dir,
                             "leaf_full.prototxt"),
        help="Model definition file."
    )
    parser.add_argument(
        "--pretrained_model",
        default=os.path.join(pycaffe_dir,
                             "leaf_full_iter_40000.caffemodel"),
        help="Trained model weights file."
    )
    parser.add_argument(
        "--gpu",
        action='store_true',
        help="Switch for gpu computation."
    )
    parser.add_argument(
        "--center_only",
        action='store_true',
        help="Switch for prediction from center crop alone instead of " +
             "averaging predictions across crops (default)."
    )
    parser.add_argument(
        "--images_dim",
        default='32,32',
        help="Canonical 'height,width' dimensions of input images."
    )
    parser.add_argument(
        "--mean_file",
        default=os.path.join(pycaffe_dir,
                             'leaf_mean.npy'),
        help="Data set image mean of [Channels x Height x Width] dimensions " +
             "(numpy array). Set to '' for no mean subtraction."
    )
    parser.add_argument(
        "--input_scale",
        type=float,
        help="Multiply input features by this scale to finish preprocessing."
    )
    parser.add_argument(
        "--raw_scale",
        type=float,
        default=255.0,
        help="Multiply raw input by this scale before preprocessing."
    )
    parser.add_argument(
        "--channel_swap",
        default='2,1,0',
        help="Order to permute input channels. The default converts " +
             "RGB -> BGR since BGR is the Caffe default by way of OpenCV."
    )
    parser.add_argument(
        "--ext",
        default='jpg',
        help="Image file extension to take as input when a directory " +
             "is given as the input file."
    )
    args = parser.parse_args()
    image_dims = [int(s) for s in args.images_dim.split(',')]
    mean, channel_swap = None, None
    if args.mean_file:
        mean = np.load(args.mean_file)
    if args.channel_swap:
        channel_swap = [int(s) for s in args.channel_swap.split(',')]
    # GPU 1 is hard-coded here; the --gpu switch above is not consulted.
    caffe.set_device(1)
    caffe.set_mode_gpu()
    # Make classifier.
    classifier = caffe.Classifier(args.model_def, args.pretrained_model,
                                  image_dims=image_dims, mean=mean,
                                  input_scale=args.input_scale,
                                  raw_scale=args.raw_scale,
                                  channel_swap=channel_swap)
    # Load a directory of images or a single image file.
    args.input_file = os.path.expanduser(args.input_file)
    if os.path.isdir(args.input_file):
        print("Loading folder: %s" % args.input_file)
        inputs = glob.glob(args.input_file + '/*')
    else:
        print("Loading file: %s" % args.input_file)
        inputs = glob.glob(args.input_file)
    print("Loaded %d inputs." % len(inputs))
    if os.path.exists(save_Path):
        shutil.rmtree(save_Path)
    os.mkdir(save_Path)
    start = time.time()
    im_num = 0
    for image in inputs:
        im = caffe.io.load_image(image)
        window_scale = initial_window_scale
        detections = []
        num = 0
        while window_scale < window_threshold:
            # ymin xmin ymax xmax
            window = [0, 0, window_scale, window_scale]
            flag = True
            while flag:
                # Crop image
                crop = im[window[0]:window[2], window[1]:window[3]]
                imgs = caffe.io.resize_image(crop, (32, 32))
                # predict
                prediction = classifier.predict([imgs], not args.center_only)
                # keep the window if it is classified as a lesion
                if prediction[0][1] > threshold:
                    # Merge into a copy so that the sliding window itself
                    # stays untouched for move_window.
                    result = copy.copy(window)
                    for win in list(detections):
                        isOverlap, merged = merge_rectangle(win, result)
                        if isOverlap:
                            detections.remove(win)
                            result = merged
                    detections.append(result)
                num += 1
                window, flag = move_window(window, np.shape(im), window_scale)
            window_scale = window_scale + scale_step
        # recheck the overlap rectangle
        remove = []
        for i in range(0, len(detections)):
            for j in range(i + 1, len(detections)):
                isRepeated, newWin = merge_rectangle(detections[i],
                                                     detections[j])
                if isRepeated:
                    remove.append(copy.copy(detections[i]))
        for win in remove:
            if win in detections:
                detections.remove(win)
        # draw the rectangles
        draw_rectangle(detections, image)
        print('the No.' + str(im_num) + ' image is completed:')
        print(' ' + str(len(detections)) + ' windows are detected!')
        im_num += 1


def move_window(window, shape, window_scale):
    # Step right by width_step; at the end of a row, go back to the left
    # edge and step down by height_step. flag turns False at the bottom.
    # (An alternative step of window_scale / 2 was tried as well.)
    flag = True
    if (window[3] + width_step) < shape[1]:
        window[1] = window[1] + width_step
        window[3] = window[3] + width_step
    else:
        window[1] = 0
        window[3] = window_scale
        if (window[2] + height_step) < shape[0]:
            window[0] = window[0] + height_step
            window[2] = window[2] + height_step
        else:
            flag = False
    return window, flag


def draw_rectangle(detectedList, image):
    # read the image
    im = cv2.imread(image)
    # draw the rectangles
    for window in detectedList:
        cv2.rectangle(im, (window[1], window[2]), (window[3], window[0]),
                      (0, 0, 255), 5)
    # save the image under its original file name
    cv2.imwrite(os.path.join(save_Path, os.path.basename(image)), im)


def merge_rectangle(window1, window2):
    # Windows are [ymin, xmin, ymax, xmax]. Returns (isOverlap, result):
    # result is the larger window when the two overlap by more than
    # areaThreshold, otherwise a copy of window2.
    isOverlap = False
    result = copy.copy(window2)
    # find the overlap area
    if not ((window1[1] > window2[3]) or (window2[1] > window1[3]) or
            (window1[0] > window2[2]) or (window2[0] > window1[2])):
        # intersection: the larger of the mins, the smaller of the maxes
        overlap = [max(window1[0], window2[0]), max(window1[1], window2[1]),
                   min(window1[2], window2[2]), min(window1[3], window2[3])]
        # compute the area of rectangles
        overlapArea = float((overlap[2] - overlap[0]) *
                            (overlap[3] - overlap[1]))
        Area1 = float((window1[2] - window1[0]) * (window1[3] - window1[1]))
        Area2 = float((window2[2] - window2[0]) * (window2[3] - window2[1]))
        # select the window
        if ((overlapArea / Area1) > areaThreshold or
                (overlapArea / Area2) > areaThreshold):
            if Area1 > Area2:
                result = copy.copy(window1)
            isOverlap = True
    return isOverlap, result


if __name__ == '__main__':
    main(sys.argv)
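The merge decision rests on how much of either window is covered by their intersection. The stand-alone sketch below reproduces just that measure so it can be checked on concrete boxes; intersection_area and overlap_ratio are hypothetical helper names, and the boxes use the same [ymin, xmin, ymax, xmax] layout as detect.py.

```python
def intersection_area(a, b):
    """Area shared by two [ymin, xmin, ymax, xmax] boxes (0 if disjoint)."""
    h = min(a[2], b[2]) - max(a[0], b[0])
    w = min(a[3], b[3]) - max(a[1], b[1])
    return max(h, 0) * max(w, 0)


def overlap_ratio(a, b):
    """Largest fraction of either box that is covered by the other."""
    inter = float(intersection_area(a, b))
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return max(inter / area_a, inter / area_b)


a = [0, 0, 32, 32]
b = [0, 8, 32, 40]  # the same window shifted right by one step of 8
ratio = overlap_ratio(a, b)
print(ratio)
```

With areaThreshold at 0.8, two equally sized 32x32 windows one step apart would not be merged under this measure, while a small window lying inside a larger one would be.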
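Results:

5 Discussion
(1) Caffe is written in C++, so training is fast. However, I have not yet seen a way in Caffe to inspect a network's intermediate results. I therefore think Caffe is a good choice for training, since it is quick to pick up, but studying the network itself may require modifying Caffe or bringing in another framework.
(2) The sliding-window parameters strongly affect the detection results, and of course the running time as well.
(3) Detection itself is fairly simple. The deeper question is how to design a network that recognizes different types of lesions.
(4) Early on, detection produced a large number of false positives. Analysis showed that some kinds of background had never been included as negative samples, so I added more negative images and retrained, which improved the accuracy considerably.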
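Point (2) can be made concrete by counting how many classifier calls a given setting implies. The estimate below uses the usual floor((size - scale) / step) + 1 formula, so it may differ slightly from what move_window visits at the image edges; count_windows, total_windows and the 256x256 image size are assumptions for illustration, while the scales and steps are the constants from detect.py.

```python
def count_windows(height, width, scale, step):
    """Number of scale x scale windows that fit at the given step."""
    if scale > height or scale > width:
        return 0
    rows = (height - scale) // step + 1
    cols = (width - scale) // step + 1
    return rows * cols


def total_windows(height, width, first_scale, max_scale, scale_step, step):
    """Sum the window counts over all scales below max_scale."""
    total = 0
    scale = first_scale
    while scale < max_scale:
        total += count_windows(height, width, scale, step)
        scale += scale_step
    return total


fine = total_windows(256, 256, first_scale=32, max_scale=150,
                     scale_step=10, step=8)
coarse = total_windows(256, 256, first_scale=32, max_scale=150,
                       scale_step=10, step=16)
print(fine, coarse)
```

Doubling the step cuts the number of windows per scale roughly fourfold, at the cost of coarser localization.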
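6 Code
My code is not tidied up yet. Once I am home I will clean it up and publish it on my GitHub page. Remember to fix the paths before using it.

https://github.com/GumpCode/leaf
I will report any progress in my deep learning work in further posts. This post is original; when reposting, please credit:

http://blog.csdn.net/GumpCode/article/details/50561008

Source: CSDN

Reposts should credit: 电子人社区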
