A parallel searching algorithm for the insetting procedure in Matlab Parallel Toolbox

 Werbung

 4 views
of 7
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Description
""In this paper we present the implementation of a parallel searching algorithm, which is used for the insetting procedure in cartography. The calculation time of the above procedure is very long due to the fact that the datasets in
Share
Transcript
  A parallel searching algorithm for the insettingprocedure in Matlab Parallel Toolbox Dimitris N. Varsamis ∗ , Paris A. Mastorocostas ∗ , Apostolos K. Papakonstantinou † and Nicholas P. Karampetakis ‡∗ Department of Informatics & CommunicationsTechnological Educational Institute of Serres, Serres 62124, GreeceEmail: dvarsam@teiser.gr, mast@teiser.gr † Cartography and Geoinformation Lab, Department of GeographyUniversity of the Aegean, Mytilene 81100, GreeceEmail: apapak@geo.aegean.gr ‡ Department of MathematicsAristotle University of Thessaloniki, Thessaloniki 54124, GreeceEmail: karampet@math.auth.gr  Abstract —In this paper we present the implementation of aparallel searching algorithm, which is used for the insettingprocedure in cartography. The calculation time of the aboveprocedure is very long due to the fact that the datasets incartography are maps with large and very large resolution. Thepurpose of this proposal is to reduce the calculation time in amulticore machine with shared memory. The proposed algorithmand the performance tests are developed in Matlab ParallelToolbox. I. I NTRODUCTION C ARTOGRAPHY often deals with problems that confrontthe necessity of designing cartographic software tools.One of these problems faced by cartographers is the insettingprocedure. In this procedure mapmakers are placing insetsinto the maps, in order to tackle with the land discontinuityproblem [1], [2].Papakonstantinou et al. in [2], [3] presented a softwaretool called Inset Mapper (IM), that helps cartographers toovercome insetting procedure difficulties. In this approach, acomputational procedure following specific cartographic rulesfor calculating the appropriate position and the optimal scalefor an inset was presented [4], [5]. In this computationalprocedure, a modified searching algorithm is used in order tocalculate the number and the position of the proposed insets.This algorithm runs repetitively to calculate inset dimensionsuntil the computational process finds the optimal results. Boththe computational procedure and the searching algorithm weredeveloped in Matlab and were embedded into the IM softwaretool. The calculation time of this computational procedureand more specific the searching algorithm execution timedepends on the dataset size and is defined from a large degreepolynomial function. In very large geographic datasets theexecution time problem becomes more significant,therefore itis essential to study efficient ways of parallel processing in This work was supported by the Research Committee of the Serres Instituteof Education and Technology order to reduce it.Parallel processing is one of the most successful effortsto introduce explicit parallelism to high level programminglanguages [6], [7], [8]. This approach is selected becausemany useful computations can be conducted in terms of a setof independent sub-computations, each strongly associatedto an element of a large dataset, where such computationsare inherently parallelizable [9], [10]. Parallel processingis particularly convenient because it can scale easily tolarge problem sizes, such as searching into large geographicdatasets.The aim of this paper is to present a parallel processingapproach in order to reduce the calculation time of thecomputational procedure and the total execution time of theIM software tool, respectively. In particular, in section 2we present the serial searching algorithm. The theoreticalnumber of iterations and the theoretical computational time arecomputed as well. In section 3 the modification of the abovealgorithm that is implemented using parallel processing ispresented. Additionally, in this section we compute the corre-sponding theoretical number of iterations, while the theoreticalcalculation times for the parallel processing implementationsis also given. Finally, in section 4 we apply the implementedalgorithms using the parallel computing toolbox of Matlab[11], [12], [13] and a comparison of execution times forspecific geographic datasets is conducted.II. T HE S ERIAL S EARCHING A LGORITHM In this section we present the modified for the insettingprocedure serial searching algorithm, witch is based on thesequential searching algorithm. In particular, the modified al-gorithm is implemented using two double loops in order to findthe total number of inset placement positions. Additionally, wecompute the number ( L ) of iterations and the bounded intervalof  L . Finally, the computational complexity of this algorithmis given.Preprints of the Federated Conference onComputer Science and Information Systems pp. 615–621ISBNc  2012 615  Algorithm 1 Serial processing Input: A,m i ,n i Output: N,positions for i = 1 to m − m i + 1 do  j ← 1 while j ≤ n − n i + 1 do B ← A ( i to i + m i − 1 ,j to j + n i − 1) bj ← n i done ← true while bj ≥ 1 do bi ← 1 while bi ≤ m i doif  b ( bi,bj ) = 1 then done ← false pos ← bj + 1 Break  end if  bi ← bi + 1 end while bj ← bj − 1 end whileif  done = true then Count and Store position  j ←  j + 1 else  j ←  pos end if end whileend for The primary aim of this algorithm is to compute the number N  of the proposed inset map placement positions in a map usingpredefined inset dimensions. The geographic dataset (map) istransformed to a two-dimensional ( m × n ) numerical matrix,where n is the length and m the width of the dataset in pixels.This matrix has values, for every element zero correspondingto sea and one to land: a i,j =  1 , land 0 , sea (1)Let a matrix A ∈ K m × n and the matrix B ∈ K m i × n i , beingsub-matrix of A, where K = { 0 , 1 } . The dimensions of matrixB are m i × n i , where m i and n i are the length and the width(in pixels) of the proposed inset, respectively.In mathematical representation, the problem is to count (andstore positions) how many times the matrix B exists into thematrix A having all the elements equal to zero.The serial searching algorithm is described in Algorithm 1.In this algorithm the number of iterations depends onthe size of matrix A and B . Additionally, for both internaland external loops the number of iterations depends on theelements of matrix B , namely on the existence of the valueone. The external double loop is defined by the indices i and  j and the internal double loop is defined by the indices bj and bi , respectively. In the external double loop the numberof iterations for index i is constant and equal to m − m i +1 ,while the number of iterations for index j is between 1 and n − n i +1 . Thus, the maximum amount of iterations for externalloop is ( n − n i + 1) × ( m − m i + 1) .In the internal loop the respective amount is n i × m i iterations.Therefore, the maximum total number of iterations that areexecuted in the serial algorithm, where the inputs are A m × n , m i and n i , is given by L max ( n,m ) = ( n − n i + 1) · ( m − m i + 1) · n i · m i (2)Consequently, the theoretical complexity of the above algo-rithms is of  O  n 2 m 2  .Specifically, in the external loop, the number of iterations foreach row (index j ) depends on the coverage of the map, wherefor matrix A the coverage is given by C  = m  i =1 n  j =1 a i,j m · n where a i,j ∈ A m × n .Therefore, the number of iterations for each row is ( n − n i + 1) × (1 − C  ) .Similarly, in the internal loop the number of iterations dependson the coverage of map, thus it is equal to ( m − m i + 1) × (1 − C  ) .We conclude that the total number of iterations is L = ( n − n i + 1)( m − m i + 1) n i m i × (1 − C  ) 2 According to the following cartographic rules from [3], [14]we have that n · m 16 ≤ n i · m i ≤ n · m 8 In case where n ≃ m and n i ≃ m i , he have n 2 16 ≤ n 2 i ≤ n 2 8 and m 2 16 ≤ m 2 i ≤ m 2 8 or equivalently n 4 ≤ n i ≤ n √  8 and m 4 ≤ m i ≤ m √  8 (3)Therefore, the total number of iterations is enclosed L 1 ≤ L ≤ L 2 where L 1 =  ( √  8 − 1) n + √  8  ( √  8 − 1) m + √  8  nm 8 2 × (1 − C  ) 2 and L 2 =(3 n + 4)(3 m + 4) nm 4 4 × (1 − C  ) 2 The number of iterations is increased rapidly while the dimen-sions of map are increased (Figure 1a and 1b).In case where m = 200 and n = 200 , we have 0 . 19148 × 10 7 ≤ L ≤ 5 . 70025 × 10 7 616 PREPRINTS OF THE FEDCSIS. WROCŁAW, 2012  (a) L 1 (b) L 2 Fig. 1. The lower and upper bound of iterations while in case where m = 500 and n = 500 , we have 2 . 9781 × 10 7 ≤ L ≤ 220 . 9 × 10 7 The increase in L has as a result a significant influenceon the execution time of the serial algorithm and the totalcalculation time of the computational procedure of the IMsoftware tool.III. T HE P ARALLEL S EARCHING A LGORITHM The serial algorithm access all rows of the matrix A , andfor each row checks if there exists an inset map placementposition; this procedure is independent for each row. Also,the number of iterations for index i (rows) in the externalloop in matrix A is fixed. Thus, we can partition the loop forindex i (rows in matrix A ) into a ( m − m i + 1) independenttasks (Figure 2). These tasks can run simultaneously in aset of processors. Each of these tasks performs the sameprocedure for a different dataset. The architecture of parallelprocessing for this implementation is a Single InstructionMulti Data (Figure 3) with shared memory. From the abovewe conclude that the most simple way to parallelize the serialalgorithm [10], [9] is to parallelize the external loop for index i , because this loop has a constant number of iterations sincethis procedure can be applied to each row separately.The parallel searching algorithm is described in Algorithm 2.The number of iterations in each processor (row) is given by L  p = ( n − n i + 1) · n i · m i · (1 − C  ) 2 The computation time of the serial algorithm is given by T  1 = L · t c = L  p · ( m − m i + 1) · t c where t c is the computational time of each iteration.The computation time of parallel algorithm is given by T   p = ( L  p · t c + p · T  comm ) · ( m − m i + 1)  p Algorithm 2 Parallel processing Input: A,m i ,n i Output: N,positions for i = 1 to m − m i + 1 in Parallel do  j ← 1 while j ≤ n − n i + 1 do B ← A ( i to i + m i − 1 ,j to j + n i − 1) bj ← n i done ← true while bj ≥ 1 do bi ← 1 while bi ≤ m i doif  b ( bi,bj ) = 1 then done ← false pos ← bj + 1 Break  end if  bi ← bi + 1 end while bj ← bj − 1 end whileif  done = true then Count and Store position  j ←  j + 1 else  j ←  pos end if end whileend for where p is the number of processors and T  comm is the requiredtime for communication between CPUs and memory.Therefore, the theoretical speed up of the parallel algorithm is DIMITRIS VARSAMIS, PARIS MASTOROCOSTAS ET AL.: A PARALLEL SEARCHING ALGORITHM FOR THE INSETTING PROCEDURE 617  Fig. 2. Serial processing architectureFig. 3. Parallel processing architecture given by S   p = T  1 T   p = L  p · ( m − m i + 1) · t c ( L  p · t c + p · T  comm ) · ( m − m i +1)  p = L  p · t c ·  pL  p · t c + p · T  comm and the efficiency of this algorithm is given by E   p = S   p  p = L  p · t c L  p · t c + p · T  comm In case where T  comm = a · t c we have S   p = L  p · t c ·  pL  p · t c + a · t c = L  p ·  pL  p + p · a and E   p = S   p  p = L  p L  p + p · a A common practice for the insetting procedure in cartographyis to create insets with the dimensions’ proportions similarto the proportions of the selected area, in order to have abalanced design into the final result [14]. According to theprevious condition, for the realisation of the case study, weselected the inset dimensions to follow the ratio of the mapdimensions for arbitrary area to inset. We have to point outthat the efficiency E   p of the parallel algorithm depends on L  p ,  p and a . In particular, if the inset dimensions are m i = m 4 and n i = n 4 and a map has land coverage C  = 0 . 40 , then L  p =  n − n 4+ 1  · n 4 · m 4 · (1 − C  ) 2 =91600  3 n 2 + 4 n  m For a map with resolution 100 dpi the dimensions are m =400 and n = 400 , thus the number L  p is very large and theefficiency converge to one.IV. I MPLEMENTATION IN M ATLAB P ARALLEL T OOLBOX The performance tests for both serial and parallel algorithmsare implemented in Matlab Parallel Toolbox. In this toolboxthe loop command “parfor” is included, which implements theparallelization of simple loop command “for”. The parallelalgorithm is constructed in Matlab programming language 618 PREPRINTS OF THE FEDCSIS. WROCŁAW, 2012  Fig. 4. The island of Lesvos with inset with the appropriate features. The following performance testsare implemented in a personal computer using the softwareMatlab, having the specifications: Intel Core Quad CPU(Q9400) at 2600 GHz with 4 Gb RAM.In the performance tests we measure the execution time of theserial algorithm (1 core) and the parallel algorithm with twocores and four cores.The geographical dataset is the map of island Lesvos (Figure4) with resolution varying from 50 to 500 dpi, with coverage C  = 0 . 40 . According to (3) we study the following four casesfor the inset dimensions:1) m i = m 4 and n i = n 4 2) m i = 2 m 7 and n i = 2 n 7 3) m i = 3 m 10 and n i = 3 n 10 4) m i = m 3 and n i = n 3 In case (4) the number N  of inset placement position is equalto zero which means that the study of this case does not havecartographic interest. In the other three cases, the number of inset placement positions is unequal to zero and has the order N  3 < N  2 < N  1 . Therefore, the forthcoming analysis is takesinto consideration cases (1), (2) and (3).In Figures 5, 6 and 7 we present the execution time (inseconds) of the serial (1 core) and parallel (2 cores, 4 cores)algorithms for each case, respectively. An increase in themap resolution leads to increase in corresponding time. Theinset dimensions have an effect on the execution time becausein case (1) the serial algorithm (1 core) and the parallelalgorithm (2 cores) have the same time, but in cases (2) and(3) the divergence between one core, two cores and four coresincreases.The results of performance tests are hosted in Table Iwhere the speed up of the parallel algorithm is presented.Figure 8 depicts the efficiency of parallel algorithm.The speed up of the parallel algorithm for implementationwith two and four cores is greater than to one ( S   p ≥ 1 ) inall cases, except case (3) with two cores. Generally, we haveimprovement in the absolute execution time of the parallelalgorithm instead of the serial one.The speed up for implementation with four cores is greaterthan that of two cores in all cases. Thus, in a machine with Fig. 5. Execution times for serial (1 core) and parallel (2 and 4 cores)algorithms for the case (1)Fig. 6. Execution times for serial (1 core) and parallel (2 and 4 cores)algorithms for the case (2)Fig. 7. Execution times for serial (1 core) and parallel (2 and 4 cores)algorithms for the case (3) DIMITRIS VARSAMIS, PARIS MASTOROCOSTAS ET AL.: A PARALLEL SEARCHING ALGORITHM FOR THE INSETTING PROCEDURE 619
Related Search
We Need Your Support
Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks