C Java Tutor 客服在線

代做國外C C++ Java程序 QQ: 1067665373 Email:cjtutor@foxmail.com

« File Compression

Pattern detection Python

 

Algorithm for locating a pattern in a data series

The goal is to detect whether a certain pattern appears in the data series. In the following example, the pattern to be detected is a sequence of 4 numbers (see Fig 2); and the data series contains 10 data points (see Fig 1).

data_series = [-1, 2, -2, 3, 41, 38, 22, 10, -1, 3] 
pattern = [40, 30, 20, 10]

Data Series Pattern

 

澳大利亚代写assignment,代写论文,essay代写推荐-cs小码神代写If you compare these two figures, you can see that this pattern appears between 5th (at index 4) to 8th (at index 7) data points of the data series. Note that it is not an exact match, but a fairly close one. Now you can spot the pattern using your eyes, let us see how an algorithm can do it.

Since the given pattern has 4 data points, we will take a segment of 4 consecutive data points from the data series at a time and compute a similarity measure. The similarity  measure should have the property that if a segment of the data series is similar to the given pattern, then the similarity measure of that segment is large, and vice versa. Hence, if we can compute the similarity measures of all possible segments, then we can identify the segment that is most similar to the given pattern. We will now provide more details.

The algorithm begins with computing the similarity measure between the given pattern and the first segment, which is formed by the first 4 data points of the data series. In terms of the two lists above, the similarity measure for the first segment is:

data_series[0]*pattern[0] + data_series[1]*pattern[1] + data_series[2]*pattern[2] + data_series[3]*pattern[3]

After this, we compute the similarity measure between the given pattern and the second segment, which is formed by the second to fifth data points of the data series. For the above example, the similarity measure for this segment is:

data_series[1]*pattern[0] + data_series[2]*pattern[1] + data_series[3]*pattern[2] + data_series[4]*pattern[3]

澳大利亚代写assignment,代写论文,essay代写推荐-cs小码神代写We then do the same with the segment formed by the third to sixth data points (the third segment), giving a similarity measure equals to:

data_series[2]*pattern[0] + data_series[3]*pattern[1] + data_series[4]*pattern[2] + data_series[5]*pattern[3]

We repeat this until we have computed the similarity measure for the last possible segment, which is formed by the 7th to 10th data points. The following is the similarity list for the above example and a plot:

similarity = [10, 490, 1210, 2330, 3320, 2370, 1190]

Similarity index

We need to consider the following cases.

Case 1澳大利亚代写assignment,代写论文,essay代写推荐-cs小码神代写: It is possible that the given data series is shorter than the given pattern, in this case, we return "Insufficient data".

Case 2: All the similarity measures are (strictly) less than the given threshold value澳大利亚代写assignment,代写论文,essay代写推荐-cs小码神代写. This means none of the segments is similar to the given pattern. In this case, we say the pattern is not present in the data series and return "Not detected". This is not the case for this example.

Case 3: At least one similarity measure is greater than or equal to the given threshold value. This is the case for this example. The similarity measure plot above shows that the fifth similarity measure is the largest. You can work out that the fifth similarity measure is computed by using the segment formed by the fifth (index 4) to eighth (index 7) data points of the data series. This procedure is therefore telling us that the segment consisting of fifth to eighth data points is most similar to the given pattern. Indeed this is what we had found by using visual inspection earlier. We will identify the location of the pattern by using the first index of the segment that has the highest similarity measure.

  • function pattern_search_max (described below): we return the index of the highest similarity measure that is also greater than or equal to the given threshold value. In Fig 3, the index of the highest similarity measure is 4.

     

  • function pattern_search_multiple (described below): Consider the following definition of overlapping indices,

    We say two indices are overlapping if the distance between them is less than the width of the pattern (4 in the above example).

    The function 'pattern_search_multiple' returns a list of non overlapping indices that are greater than or equal to the given threshold valueand that satisfy the following criteria:

    • an index is not selected if the value at the index is less than a value at one of it's overlapping indices.
    • an index is not selected if it is overlapping with first or last index.

    In the following example, selected indices are marked with green circles. Here, index 6 is not selected because the value at index 5 (distance of 1) is greater than the value at index 6. Similarly index 32 is not selected because the value at index 34 (distance 2) is higher. The example in Fig 4 will return the list [5, 16, 34].

    There are many different approaches to calculate similarity measures between data segments and a pattern, above we have discussed just one of them! Therefore, a given data series representing similarity measures may or may not澳大利亚代写assignment,代写论文,essay代写推荐-cs小码神代写 have clearly defined 'pyramid' patterns as shown in Fig 3. For example, there are no clearly defined 'pyramid' patterns in Fig 4.

    Similarity index

     

 

澳大利亚代写assignment,代写论文,essay代写推荐-cs小码神代写This completes the description of the tasks

Implementation requirements

You need to implement the following four functions, each in a separate file (provided). You need to submit these four files, each containing one function you implement.

1. def calculate_similarity(data_segment, pattern):

    • The aim of this function is to calculate the similarity measure between one data segment and the pattern.
    • The first parameter 'data_segment' is a list of (float) values, and the second parameter 'pattern' is also a list of (float) values. The function calculates the similarity measure between the given data segment and pattern and returns the calculated similarity value (float), as described earlier in this assigment. The function returns "Error" if 'data_segment' and 'pattern' have different lengths.
    • This function can be tested using the file 'test_calculate_similarity.py'.

2. def calculate_similarity_list(data_series, pattern):

    • The aim of this function is to calculate the similarity measures between all possible data segments and the pattern.
    • The first parameter 'data_series' is a list of (float) values, and the second parameter 'pattern' is also a list of (float) values. The function calculates the similarity measures, using the above function 'calculate_similarity', of all possible data segments in a sequence, and returns the list of calculated similarity values.
    • This function can be tested using the file 'test_calculate_similarity_list.py'.

3. def pattern_search_max(data_series, pattern, threshold):

    • The three possible outcomes are "Insufficient data", "Not detected" and the location of the pattern if the pattern is found in the data series. Here, the location of the pattern refers to the index of the highest similarity measure that is also greater than or equal to the given threshold value.
    • The first parameter 'data_series' is a list of (float) values, the second parameter 'pattern' is a list of (float) values, and the third parameter 'threshold' is a (float) value. In this function, you need to use the function 'calculate_similarity_list'.
    • This function can be tested using the file 'test_pattern_search_max.py'.

4. def pattern_search_multiple(data_series, pattern_width, threshold):

    • The three possible outcomes are "Insufficient data", "Not detected" and a list of non overlapping indices that are greater than or equal to the given threshold valueand that satisfy the following criteria:
      • an index is not selected if the value at the index is less than a value at one of it's overlapping indices.
      • an index is not selected if it is overlapping with first or last index.
      We say two indices are overlapping if the distance between them is less than the width of the pattern.
    • The first parameter 'data_series' is a list of (float) values, the second parameter 'pattern_width' is a (float) value, and the third parameter 'threshold' is a (float) value.
    • This function can be tested using the file 'test_pattern_search_multiple.py'.

 

  • 相關文章:

發表評論:

◎歡迎參與討論,請在這里發表您的看法、交流您的觀點。

最新評論及回復

最近發表

Powered By

Copyright 代寫C.

在線客服

售前咨詢
售后咨詢
微信號
Essay_Cheery
微信
悉尼assignment代写,北美作业代写,代写毕业论文-100%原创 北美代写,Homework代写,Essay代寫-准时✔️高质✔最【靠谱】 墨尔本assignment代写,代写毕业论文,paper代写-51作业君 北美代写,程序代做,程序代写,java代写,python代写,c++代写,c代写 英国代写paper,python代写,Report代写,编程代写-程序代写网 北美代写essay,程序代写,Java代写代做,Java代考-焦点论文 澳大利亚essay代写,编程代写,代码代写,程序代写-三洋编程 加拿大essay代写|程序代写代做||Python代写|Matlab代写-Meeloun 澳大利亚代写,代写essay,代写毕业论文,留学生代写-小马代写 日本代写,北美作业代写,新加坡代写,essay代写-无时差服务