
Training a Keras model with Azure Machine Learning

Stack Overflow user
Asked 2021-08-12 20:25:41
1 answer · 297 views · 0 followers · 0 votes

I trained a multi-class classification model locally using Keras. I am trying to migrate it so that it can be trained and run in Azure Machine Learning (AML).

I have provided the code sections used for AML below: the main AML code and the script that trains the model (EnsemblingModel.py). From the main AML code, the training script is invoked via src = ScriptRunConfig.

Please note that I have also uploaded the dataset the model should train on directly to AML, registered under the name 'test_data'.

However, when the line RunDetails(run).show() is executed from the main AML code section, an error is returned. The error is:

Error occurred: User program failed with FileNotFoundError: [Errno 2] No such file or directory: 'test_data'

This error message refers to the following line in the EnsemblingModel.py script:

dataframe = pd.read_csv("test_data", header=None)

I understand that the script cannot load the data, so I tried changing the code, for example:

dataframe = dataset.get_by_name(ws, name='test_data')

which returns the following error:

Error occurred: User program failed with NameError: name 'dataset' is not defined

How can I change this so that the script can read and load the data and training can begin? Perhaps I am going about this entirely the wrong way, so any suggestions are welcome.

I have looked through various Microsoft documentation as well as the GitHub guides here, but there seem to be only limited examples.

I am new to AML, so if anyone has any resources for using it with Keras, that would also be appreciated.
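As a local sanity check (not part of the original post): the FileNotFoundError above is reproducible without Azure at all. A bare relative path like "test_data" only resolves if the file sits in the process's current working directory, and on a remote compute target that directory contains just the snapshot of source_directory, not datasets registered in the workspace. A minimal sketch:

```python
import os
import tempfile

import pandas as pd


def load_training_data(path):
    """Load a headerless CSV the same way EnsemblingModel.py does."""
    return pd.read_csv(path, header=None)


# Run from an empty directory, mimicking the remote snapshot that
# contains only the files copied from source_directory.
workdir = tempfile.mkdtemp()
os.chdir(workdir)

try:
    load_training_data("test_data")
    outcome = "loaded"
except FileNotFoundError as exc:
    outcome = f"FileNotFoundError: {exc}"

print(outcome)
```

This prints the same "[Errno 2] No such file or directory: 'test_data'" seen in the run logs, which is why the fix has to deliver the registered dataset to the script rather than rely on a file path.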

Main AML code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os
import azureml
from azureml.core import Experiment
from azureml.core import Environment
from azureml.core import Dataset
from azureml.core import Workspace, Run
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep='\n')

script_folder = './TestingModel1'
os.makedirs(script_folder, exist_ok=True)

exp = Experiment(workspace=ws, name='TestingModel1')

dataset = Dataset.get_by_name(ws, name='test_data')
dataframe = dataset.to_pandas_dataframe()
df = dataframe.values


cluster_name = "cpu-cluster"

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6',
                                                           max_nodes=4)

    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)

compute_targets = ws.compute_targets
for name, ct in compute_targets.items():
    print(name, ct.type, ct.provisioning_state)

keras_env = Environment.from_conda_specification(name = 'keras-2.3.1', file_path = './conda_dependencies.yml')

# Specify a GPU base image
#keras_env.docker.enabled = True
keras_env.docker.base_image = 'mcr.microsoft.com/azureml/openmpi3.1.2-cuda10.0-cudnn7-ubuntu18.04'

from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory=script_folder,
                      script='EnsemblingModel.py',
                      compute_target=compute_target,
                      environment=keras_env)

run = exp.submit(src)

from azureml.widgets import RunDetails
RunDetails(run).show()
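The environment above is built from ./conda_dependencies.yml, which is not included in the question. For completeness, a minimal specification matching the 'keras-2.3.1' environment name and the CUDA 10.0 base image might look like the following (the exact package pins are illustrative assumptions, not taken from the original post):

```yaml
name: keras-2.3.1
channels:
  - conda-forge
dependencies:
  - python=3.6
  - pip
  - pip:
      - azureml-defaults
      - tensorflow-gpu==1.15.2
      - keras==2.3.1
      - scikit-learn
      - pandas
      - matplotlib
```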

Ensembling model code (EnsemblingModel.py):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

#KerasLibraries
from keras import callbacks
from keras.layers.normalization import BatchNormalization
from keras.layers import Activation
from keras.layers import Dropout
from keras.optimizers import SGD
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils

#tensorFlow
import tensorflow as tf

#SKLearnLibraries
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
from azureml.core import Run

# In[3]:

dataframe = pd.read_csv("test_data", header=None)
dataframe = dataset.get_by_name(ws, name='test_data')
dataset = dataframe.values


# In[4]:


X = dataset[:,0:22].astype(float)
y = dataset[:,22]

# encode class values as integers
encoder = LabelEncoder()
encoder.fit(y)
encoded_y = encoder.transform(y)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_y)

print(dummy_y.shape)
#print(X.shape)
#print(X)
import sys
np.set_printoptions(threshold=sys.maxsize)
dummy_y_new = dummy_y[0:42,:]

print(dummy_y_new)
#dataset


# In[5]:


earlystopping = callbacks.EarlyStopping(monitor ="val_loss", 
                                        mode ="min", patience = 125, 
                                        restore_best_weights = True)
  
#define Keras
model1 = Sequential()
model1.add(Dense(50, input_dim=22))
model1.add(BatchNormalization())
model1.add(Activation('relu'))
model1.add(Dropout(0.5,input_shape=(50,)))
model1.add(Dense(50))
model1.add(BatchNormalization())
model1.add(Activation('relu'))
model1.add(Dropout(0.5,input_shape=(50,)))
model1.add(Dense(8, activation='softmax'))

#compile the keras model

model1.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])   

# fit the keras model on the dataset
model1.fit(X, dummy_y, validation_split=0.25, epochs=10000, batch_size=100, verbose=1, callbacks=[earlystopping])


_, accuracy3 = model1.evaluate(X, dummy_y, verbose=0)

print('Accuracy: %.2f' % (accuracy3*100))

    
predict_dataset = tf.convert_to_tensor([
            [1,5,1,0.459,0.322,0.041,0.002,0.103,0.032,0.041,14,0.404,0.284,0.052,0.008,0.128,0.044,0.037,0.043,54,0,155],
])


predictions = model1(predict_dataset, training=False)
   
predictions2 = predictions.numpy()
print(predictions2)
print(type(predictions2))

1 Answer

Stack Overflow user

Answered 2021-08-13 10:16:25

I solved the above by adding an argument to the ScriptRunConfig code:

test_data_ds = Dataset.get_by_name(ws, name='test_data')

src = ScriptRunConfig(source_directory=script_folder,
                      script='EnsemblingModel.py',
                      # pass dataset as an input with friendly name 'test_data'
                      arguments=['--input-data', test_data_ds.as_named_input('test_data')],
                      compute_target=compute_target,
                      environment=keras_env)

and the following in the modeling script itself:

import argparse
from azureml.core import Dataset, Run

parser = argparse.ArgumentParser()
parser.add_argument("--input-data", type=str)
args = parser.parse_args()

run = Run.get_context()
ws = run.experiment.workspace

# get the input dataset by ID
dataset = Dataset.get_by_id(ws, id=args.input_data)

# load the TabularDataset to pandas DataFrame
df = dataset.to_pandas_dataframe()
dataset = df.values

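The plumbing in this answer can be checked locally: as_named_input(...) causes AML to substitute the dataset's ID into the script arguments at run time, and the script only needs argparse to receive it. A minimal sketch of the parsing step (the ID value below is a made-up placeholder, not a real dataset ID):

```python
import argparse


def parse_input_data(argv):
    """Parse the --input-data argument exactly as the training script does."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input-data", type=str)
    return parser.parse_args(argv)


# At run time AML replaces the named input with the dataset's ID string;
# this placeholder stands in for that value.
args = parse_input_data(["--input-data", "00000000-0000-0000-0000-000000000000"])
print(args.input_data)
```

As a design note, an alternative shown in the Azure ML documentation is to skip the argument entirely and read the named input via run.input_datasets['test_data'] inside the script; the argument-based approach used here has the advantage of keeping the script runnable with an explicit ID.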
For anyone curious, more information can be found here.

Original question on Stack Overflow:

https://stackoverflow.com/questions/68763774
