Literal Odds and Ends
-
- Posts: 18718
- Joined: Wed Nov 30, 2016 7:14 am
Re: Literal Odds and Ends
I'm going outside now, I may be some time....
For legal reasons, we are not threatening to destroy U.S. government property with our glorious medieval siege engine. But if we wanted to, we could. But we won’t. But we could.
-
- Posts: 18718
- Joined: Wed Nov 30, 2016 7:14 am
Re: Literal Odds and Ends
Pah, lying fake news media. 3 or 4 inches of snow and you'd think it was "The Day After Tomorrow" with the fuss they are making.
-
- Posts: 38685
- Joined: Wed Nov 30, 2016 5:59 pm
Re: Literal Odds and Ends
I got it running. Created 13 random subsets of the test data like the original research. Trained it on the first.
It's actually a pretty hard problem because it does NOT converge to a low loss threshold; it just halts when it reaches my max-epochs parameter. Still, I threw in random samples from one of the sets I did not use for training, and I get a good match each time.
I think if I trained it for about 10,000 epochs the output would be much closer to the target instead of off by about 2%.
I picked a slightly different input and hidden layer size than the original and got much better results than they did.
In the training data, a desired output of 1.0 corresponds to a mine, and -1.0 corresponds to a rock. So the sonar data in the input vector above came from a metal cylinder in the sonar experiments. The initial random weights in the network put that at no match. After training, it was a pretty solid match to a mine, which was the desired response.
Code: Select all
untrained..
input: <0.0221, 0.0065, 0.0164, 0.0487, 0.0519, 0.0849, 0.0812, 0.1833, 0.2228, 0.181, 0.2549, 0.2984, 0.2624, 0.1893, 0.0668, 0.2666, 0.4274, 0.6291, 0.7782, 0.7686, 0.8099, 0.8493, 0.944, 0.945, 0.9655, 0.8045, 0.4969, 0.396, 0.3856, 0.5574, 0.7309, 0.8549, 0.9425, 0.8726, 0.6673, 0.4694, 0.1546, 0.1748, 0.3607, 0.5208, 0.5177, 0.3702, 0.224, 0.0816, 0.0395, 0.0785, 0.1052, 0.1034, 0.0764, 0.0216, 0.0167, 0.0089, 0.0051, 0.0015, 0.0075, 0.0058, 0.0016, 0.007, 0.0074, 0.0038>
desired output: <1.0>
actual output: 0.635258000850496
Trained..
input: <0.0221, 0.0065, 0.0164, 0.0487, 0.0519, 0.0849, 0.0812, 0.1833, 0.2228, 0.181, 0.2549, 0.2984, 0.2624, 0.1893, 0.0668, 0.2666, 0.4274, 0.6291, 0.7782, 0.7686, 0.8099, 0.8493, 0.944, 0.945, 0.9655, 0.8045, 0.4969, 0.396, 0.3856, 0.5574, 0.7309, 0.8549, 0.9425, 0.8726, 0.6673, 0.4694, 0.1546, 0.1748, 0.3607, 0.5208, 0.5177, 0.3702, 0.224, 0.0816, 0.0395, 0.0785, 0.1052, 0.1034, 0.0764, 0.0216, 0.0167, 0.0089, 0.0051, 0.0015, 0.0075, 0.0058, 0.0016, 0.007, 0.0074, 0.0038>
desired output: <1.0>
actual output: 0.987832214713394
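The setup described above — 60-component sonar vectors in, a single output where 1.0 means mine and -1.0 means rock, small random initial weights giving a near-zero "untrained" response — can be sketched roughly like this. The hidden-layer width and tanh activations here are illustrative assumptions, not the actual sizes the poster used:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: the sonar vectors have 60 components;
# the hidden-layer width is just an illustrative choice.
N_IN, N_HIDDEN = 60, 12

# Small random initial weights, as in the "untrained" output above.
W1 = rng.normal(0, 0.1, (N_HIDDEN, N_IN))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(0, 0.1, N_HIDDEN)
b2 = 0.0

def forward(x):
    """One forward pass: tanh hidden layer, tanh output in (-1, 1),
    so +1.0 can stand for 'mine' and -1.0 for 'rock'."""
    h = np.tanh(W1 @ x + b1)
    return np.tanh(W2 @ h + b2)

x = rng.random(N_IN)   # stand-in for one 60-value sonar vector
y = forward(x)         # untrained output: close to 0, i.e. "no match"
```

Training would then push `y` toward the desired 1.0 for mine examples, which is the before/after behavior shown in the output above.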
-
- Posts: 38685
- Joined: Wed Nov 30, 2016 5:59 pm
Re: Literal Odds and Ends
The fact that my training algorithm did not converge on a minimum tells me it's quite unlikely you can easily extract features from the data using SQL side effects.
Most of the training sets people use for educational purposes will just reach a low loss function threshold in a few hundred epochs at most. This thing can run for ten thousand epochs and not converge.
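The dual stopping condition being described — break out early if the loss drops below a threshold, otherwise halt at a max-epochs cap — looks roughly like this. The toy least-squares problem and all the numbers here are illustrative stand-ins, not the poster's actual network or parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D least-squares problem standing in for the network's loss.
# The noise floor keeps the loss well above the threshold, so this
# run exhibits the behavior described: it hits the epoch cap instead
# of converging below the threshold.
xs = rng.normal(size=100)
ys = 3.0 * xs + rng.normal(scale=0.5, size=100)

w, lr = 0.0, 0.01
LOSS_THRESHOLD, MAX_EPOCHS = 1e-6, 10_000

for epoch in range(MAX_EPOCHS):
    err = w * xs - ys
    loss = np.mean(err ** 2)
    if loss < LOSS_THRESHOLD:
        break                        # converged to a low loss
    w -= lr * np.mean(2 * err * xs)  # one gradient-descent step
else:
    print("hit MAX_EPOCHS without converging")
```

Easy educational datasets take the `break` path within a few hundred epochs; a problem with an irreducibly high loss floor runs the cap out every time.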
-
- Posts: 25273
- Joined: Wed Nov 30, 2016 6:50 am
- Location: Ohio
Re: Literal Odds and Ends
What is your program looking at? Curves, smoothness, or something else?
-
- Posts: 38685
- Joined: Wed Nov 30, 2016 5:59 pm
Re: Literal Odds and Ends
GrumpyCatFace wrote:What is your program looking at? Curves, smoothness, or something else?
Backpropagation algorithms look at the error space and converge on a local minimum. Imagine the error space as a three-dimensional graph with a kind of funnel shape. That's the search space for error in the network outputs. You can traverse the error surface via a kind of gradient search (using partial differentiation of the activation function).
It automatically adjusts for some subtle features. A huge subfield in deep learning is trying to extract those features from trained networks.
For it to work, the activation functions of the neural nodes have to be differentiable, though.
I was thinking about trying a different algorithm (ADALINE/MADALINE) as well, but the fact that sigmoids couldn't easily converge on the minimum tells me there are a lot of features in that data. To do it right probably requires multiple hidden layers trained to identify specific features in the data.
https://en.wikipedia.org/wiki/Backpropagation
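A single gradient step on one output node shows where the differentiability requirement comes in: the chain rule needs the derivative of the activation. The weights, input, and learning rate below are random placeholders for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Backprop needs a differentiable activation; for the sigmoid the
# derivative has the convenient closed form s * (1 - s).
def sigmoid_deriv(z):
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(2)
w = rng.normal(size=5)   # weights into a single output node (placeholder)
x = rng.normal(size=5)   # one input vector (placeholder)
target = 1.0

z = w @ x
out = sigmoid(z)
err = out - target

# Gradient of the squared error w.r.t. the weights, via the chain rule:
# dE/dw = (out - target) * sigmoid'(z) * x
grad = err * sigmoid_deriv(z) * x

lr = 0.5
w_new = w - lr * grad    # one gradient-descent step down the error surface

# The step moves the output toward the target.
new_err = sigmoid(w_new @ x) - target
```

Repeating this step over every weight in every layer, with errors propagated backward through the chain rule, is the whole algorithm; a non-differentiable activation breaks the `sigmoid_deriv` term.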
-
- Posts: 38685
- Joined: Wed Nov 30, 2016 5:59 pm
Re: Literal Odds and Ends
Also, I don't really understand the sonar data, so I don't really know what exactly the network is learning. That's the cool thing about deep learning, but also the difficulty in that you very well might not understand the domain very well, so troubleshooting your network can be a bitch.
-
- Posts: 25273
- Joined: Wed Nov 30, 2016 6:50 am
- Location: Ohio
Re: Literal Odds and Ends
LOL ok. Speaker to Animals wrote:GrumpyCatFace wrote:What is your program looking at? Curves, smoothness, or something else?
Backpropagation algorithms look at the error space and converge on a local minimum. Imagine the error space as a three-dimensional graph with a kind of funnel shape. That's the search space for error in the network outputs. You can traverse the error surface via a kind of gradient search (using partial differentiation of the activation function).
It automatically adjusts for some subtle features. A huge subfield in deep learning is trying to extract those features from trained networks.
For it to work, the activation functions of the neural nodes have to be differentiable, though.
I was thinking about trying a different algorithm (ADALINE/MADALINE) as well, but the fact that sigmoids couldn't easily converge on the minimum tells me there are a lot of features in that data. To do it right probably requires multiple hidden layers trained to identify specific features in the data.
https://en.wikipedia.org/wiki/Backpropagation
I think it's "smoothness".
-
- Posts: 38685
- Joined: Wed Nov 30, 2016 5:59 pm
Re: Literal Odds and Ends
No. It is a gradient descent on the error space. The features that the network learns are not at all obvious.
edit: oh, you mean smoothness of the sonar target? Maybe. I don't know how to interpret the data, so I have no idea what the network is learning. I suspect it's a lot more than just one feature, since it does not quickly converge on a local minimum.
The original research paper indicates that aspect angle plays a big part in the problem. It has to be incorporating aspect angle into its learning no matter what. So if it also looks at smoothness, you have two features to learn.
-
- Posts: 25273
- Joined: Wed Nov 30, 2016 6:50 am
- Location: Ohio
Re: Literal Odds and Ends
Well, from any angle, you will see smoothness vs. a rock. Speaker to Animals wrote:No. It is a gradient descent on the error space. The features that the network learns are not at all obvious.
edit: oh, you mean smoothness of the sonar target? Maybe. I don't know how to interpret the data, so I have no idea what the network is learning. I suspect it's a lot more than just one feature, since it does not quickly converge on a local minimum.
The original research paper indicates that aspect angle plays a big part in the problem. It has to be incorporating aspect angle into its learning no matter what. So if it also looks at smoothness, you have two features to learn.
It sounds like it's comparing the 'jaggedness' of the image.