准确性和损失会产生奇怪的结果

Question

我正在尝试创建一个神经网络来解决关于我说话的音频和其他人说话的音频的分类问题，以便对其进行分类。但是当我训练它时，它给了我这种奇怪的准确性和损失结果。

这是我的代码。

    '''
This is only to read the data and pass it into an array

1. Get the Audio data, my voice so we can visualize it into an array.
2. Build an ANN with the data already into an array. classification problem
3. Real time predictor using pyaudio and trained model
'''
from sklearn.model_selection import train_test_split
from tensorflow.python.keras.layers.core import Dropout
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
import numpy as np
from scipy.io import wavfile
from pathlib import Path
import os

# cut audio to make the same sizes, shape and length
def trim_wav( originalWavPath, newWavPath , start, new ):
    '''
    :param originalWavPath: the path to the source wav file
    :param newWavPath: output wav file * can be same path as original
    :param start: time in seconds
    :param end: time in seconds
    :return:
    '''
    sampleRate, waveData = wavfile.read( originalWavPath )
    startSample = int( start * sampleRate )
    endSample = int( new * sampleRate )
    wavfile.write( newWavPath, sampleRate, waveData[startSample:endSample])


### DATASET 
pathlist = Path(os.path.abspath('Voiceclassification/Data/me/')).rglob('*.wav')

# My voice data
for path in pathlist:
    wp = str(path)
    # Trim function here for each file
    trim_wav(wp, wp.replace(".wav", ".wav"), 0,5)

    filename = str(path)

    # convert audio to numpy array and then 2D to 1D np Array
    samplerate, data = wavfile.read(filename)
    #print(f"sample rate: {samplerate}")
    #print(f"data: {data}")

pathlist2 = Path(os.path.abspath('Voiceclassification/Data/other/')).rglob('*.wav')

# other voice data
for path2 in pathlist2:
    wp2 = str(path2)
    trim_wav(wp2, wp2.replace(".wav", ".wav"), 0,5)

    filename2 = str(path2)

    samplerate2, data2 = wavfile.read(filename2)
    #print(data2)


### ADAPTING THE DATA FOR THE MODEL
X = data.reshape(-1, 1) # My voice
y = data2.reshape(-1, 1) # Other data
#print(X_.shape)
#print(y_.shape)

### Trainig the model
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

# Performing future scaling
sc = StandardScaler()

x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

### Creating the ANN
ann = tf.keras.models.Sequential()

# First hidden layer of the ann
ann.add(tf.keras.layers.Dense(units=6, activation="relu"))
ann.add(Dropout(0.05))
# Second one
ann.add(tf.keras.layers.Dense(units=6, activation="relu"))
ann.add(Dropout(0.05))
# Output layer
ann.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))

# Compile our neural network
ann.compile(optimizer="adam",
            loss="binary_crossentropy",
            metrics=['accuracy'])

# Fit ANN
ann.fit(x_train, y_train, batch_size=1024, epochs=100) ############ batch 32
ann.save('Models/voiceclassification.model')

有谁知道我的代码是否有任何问题导致 acc 非常低？

Answer 1

@MarkLavin，你的回答是正确的，我就是问这个问题的人，这是我的另一个真实账户，我尝试了@MarkLavin 告诉我做的事情并且它有效，我修改了我的代码这个。

### DATASET 
data = []
labels = []

audio_files = [f for f in glob.glob(os.path.abspath(r"Voiceclassification\Data")+"/**/*", recursive=True) if not os.path.isdir(f)]
random.shuffle(audio_files)

# My voice data
for path in audio_files:
    wp = str(path)
    # Trim function here for each file
    trim_wav(wp, wp.replace(".wav", ".wav"), 0,5)

    filename = str(path)

    # convert audio to numpy array and then 2D to 1D np Array
    samplerate, data_array = wavfile.read(filename)
    #print(f"data: {data}")

    data_array.reshape(-1, 1)

    data.append(data_array)

    label = path.split(os.path.sep)[-2]
    if label == "me":
        label = 1
    else:
        label = 0

    labels.append([label])


### ADAPTING THE DATA FOR THE MODEL
X = data # all voices data
y = np.array(labels) # data label 1 es me, 0 is other

y 的标签和 X 的所有随机数据，这是 100% 和 90% 准确率的结果，非常感谢@MarkLavin :)

准确性和损失会产生奇怪的结果

Accuracy and loss give weird results

python

machine-learning

neural-network

tensorflow