Abel Perez
Mindplex Media

Perhaps the most important principle for the good algorithm designer is to refuse to be content.
Aho, Hopcroft, and Ullman

Scala zipAll Function

The Scala zipAll function takes two lists and generates a new collection of pairs, where each pair holds the elements found at the same position in both lists. For example, if the value at position 0 of the first list is "Abel" and the value at position 0 of the second list is 100, then the resulting pair is ("Abel", 100), and it becomes the element at position 0 of the list returned by zipAll.

The big difference between zip and zipAll is how they treat lists of different lengths. With zip, the resulting list has as many elements as the shorter list. zipAll compensates for the shorter list by substituting a default value wherever that list has no element. For example, if the first list contains the value "Abel" at position 3, the second list contains only two elements, and the default value specified for y (the second list) is 100, then the pair at position 3 of the resulting list is ("Abel", 100). This probably sounds confusing, so let's jump right into an example.

// the first list
val xList = List("Abel", "Anthony", "Junior", "Blas")
// the second list
val yList = List(5, 4, 7)
// apply the zipAll function to both lists and display the results
println(xList.zipAll(yList, "John", 100))

The result of running this code is the following:

List((Abel,5), (Anthony,4), (Junior,7), (Blas,100))

As you can see, zipAll paired the element "Blas" from the first list with 100, the default value specified for y, because the second list is one element shorter than the first. Had we used zip instead, the result would have been the following:

List((Abel,5), (Anthony,4), (Junior,7))

With zip, the pair for "Blas" is skipped altogether. As you might expect, the same applies to the reverse scenario, where the first list is shorter than the second. The following demonstrates that behavior.

// the first list
val xList = List("Abel", "Anthony", "Junior")
// the second list
val yList = List(5, 4, 7, 2)
// apply the zipAll function to both lists and display the results
println(xList.zipAll(yList, "John", 100))

Here is the output of running the reverse scenario:

List((Abel,5), (Anthony,4), (Junior,7), (John,2))

Here the resulting list contains "John", the default value of x, in the pair at position 3. The Scala zipAll function is extremely useful and flexible for zipping lists together. In the next part of this tutorial we demonstrate the third zipping function, zipWithIndex.

Scala Zip Function

If you're a Java developer new to the Scala programming language and you have recently come across the zip function of the Iterable trait, odds are you're scratching your head wondering what zip is. In this simple tutorial we will demonstrate what zip is and how it is used on a List.

Zip is a fairly old concept found in functional programming languages. Zip takes two Iterable collections, for instance two Lists, and combines them into one list of pairs. The pairs are derived by taking the elements at the same position in both lists and combining them into one pair. To illustrate this, let's take a look at a simple example.

// A simple list of people's first names
val people = List("Abel", "Anthony", "Junior", "Blas")
// A simple list of lucky numbers
val numbers = List(5, 2, 7, 4)
// Here we zip the people and numbers lists and print out the resulting list.
println(people.zip(numbers))

This code constructs two lists. One list contains the names of people and the second list contains lucky numbers. On the last line we zip the people list with the lucky numbers list and print out the resulting list of pairs. The output of this code is the following:

List((Abel,5), (Anthony,2), (Junior,7), (Blas,4))

As you can see in the output, the elements at the same position of both lists are combined into a tuple and added to the resulting list in the same order they appear in the people and numbers lists.

It's important to note that if the lists to zip are not the same size, the resulting list will be the size of the shorter list. In other words, zip only generates pairs for positions that exist in both lists. For example, if the first list contains the elements ("Abel", "Anthony", "Junior", "Blas") and the second list contains the elements (5, 2, 7), then the resulting list would be (("Abel",5), ("Anthony",2), ("Junior",7)). As you can see, the element "Blas" from the first list is not present in the resulting list. Scala has another clever function that deals with this issue: the zipAll function defined in the Iterable trait of the Scala collections API. The second part of this tutorial covers zipAll. Let's jump into a simple example to further clarify.

// A simple list of people's first names
val people = List("Abel", "Anthony", "Junior", "Blas")
// A list that contains fewer elements than the people list
val numbers = List(5, 2, 7)
// Here we zip the people and numbers lists and print out the resulting list.
println(people.zip(numbers))

And the result of this code is:

List((Abel,5), (Anthony,2), (Junior,7))

As you can see, the pair for the element "Blas" at position 3 is not present in the result. And that basically sums up what the zip function found in Scala's collections is all about. Hopefully this simple tutorial has helped demystify the zip function; if not, feel free to drop me a comment with any questions you might have. The second part of this tutorial covers the zipAll function.

Hadoop Task Assignment

Hadoop Task Assignment is the process of selecting a task that is part of a Map Reduce job and assigning it to a node (TaskTracker) that can execute the task. In this article, we will cover the fundamental steps that take place during the task assignment process.

Map Reduce jobs in Hadoop are configured and submitted by clients to the Hadoop job tracker (JobTracker) system. The job tracker maintains a priority list of submitted jobs and tracks the progress of each job. Clients can configure Hadoop jobs with a priority level that represents the importance of the job relative to other jobs in the queue. By default the job priority list is a first-in first-out (FIFO) queue.

The job priority mechanism exists to give submitted jobs an opportunity to execute in a timely manner. It's possible to submit long-running jobs that share the entire Hadoop cluster and, as a result, cause pending jobs to block for a very long time. Setting a high job priority level helps jobs execute in a fair amount of time. For example, a small job should not have to wait a long time to execute because of many long-running jobs that were scheduled before it. Instead, a high priority on the small job allows it to execute before the next long-running job is processed. It's important to note that the priority level mechanism does not guarantee that high priority jobs run in a reasonable time.
A high priority job can still block waiting for its turn to run, because a low priority long-running job that was started before the high priority job was scheduled keeps running. The default FIFO scheduler is not especially efficient; however, Hadoop supports other scheduling algorithms that attempt to solve this problem, namely FairScheduler and CapacityScheduler.

Now that we know a little about how Map Reduce jobs are scheduled in Hadoop, let's examine how the tasks that make up a Hadoop job get assigned to nodes for execution. In this context the term task describes the actual Mapper and Reducer tasks that make up Hadoop jobs. The Hadoop TaskTracker is the system that lives on each node of the Hadoop cluster and is responsible for executing tasks.

The TaskTracker communicates with the JobTracker through a heartbeat protocol. In essence the heartbeat is a mechanism for TaskTrackers to announce their availability on the cluster. The heartbeat lets the JobTracker know that the TaskTracker is alive. In addition to announcing availability, the heartbeat also carries information about the state of the TaskTracker: heartbeat messages indicate whether the TaskTracker is ready for the next task or not. When the JobTracker receives a heartbeat message from a TaskTracker declaring that it is ready for the next task, the job tracker selects the next available job in the priority list and determines which task is appropriate for the TaskTracker to execute.

TaskTrackers are constrained in the number of tasks they can execute. For example, a task tracker might be configured to process only two Mapper tasks and two Reducer tasks in parallel at any given time. For this reason, the job tracker has to figure out what type of task to assign to the TaskTracker. If there is at least one free slot for a Mapper task, the TaskTracker gets a Mapper task assigned; otherwise it gets a Reducer task.
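
The slot check described above can be sketched in a few lines of Scala. This is only an illustration of the decision rule, not Hadoop's actual code: the names TaskKind, Heartbeat, and chooseTaskKind are hypothetical, and the sketch assumes a heartbeat simply reports how many Mapper and Reducer slots are free.

```scala
// Hypothetical model of the slot check; these names are illustrative only
// and are not Hadoop's real classes or API.
object TaskAssignmentSketch {
  sealed trait TaskKind
  case object MapTask extends TaskKind
  case object ReduceTask extends TaskKind

  // Assumed heartbeat payload: the tracker's free slots per task type.
  final case class Heartbeat(tracker: String, freeMapSlots: Int, freeReduceSlots: Int)

  // Prefer a Mapper task when a map slot is free, otherwise fall back to a
  // Reducer task; return None when the tracker has no free slot at all.
  def chooseTaskKind(hb: Heartbeat): Option[TaskKind] =
    if (hb.freeMapSlots > 0) Some(MapTask)
    else if (hb.freeReduceSlots > 0) Some(ReduceTask)
    else None

  def main(args: Array[String]): Unit = {
    println(chooseTaskKind(Heartbeat("tracker-1", freeMapSlots = 1, freeReduceSlots = 2)))
    println(chooseTaskKind(Heartbeat("tracker-2", freeMapSlots = 0, freeReduceSlots = 1)))
    println(chooseTaskKind(Heartbeat("tracker-3", freeMapSlots = 0, freeReduceSlots = 0)))
  }
}
```

Returning an Option makes the "no free slot" case explicit, which the prose leaves implicit: a tracker that is completely busy should not be handed any task.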
The number of tasks a TaskTracker can execute simultaneously depends on the number of cores and the amount of memory available on the node the TaskTracker is running on.

Hadoop tries to be efficient when processing tasks by considering the locality of the data the task will process. Every Mapper task in a Hadoop job is assigned an input split, and the data for that input split could be located anywhere in the HDFS cluster. The scheduler first attempts to match a task with a node that contains the data locally. When the data to be processed is local to the node executing the task, the TaskTracker node is relieved of having to download the required data from a remote node. When a task can't be matched with a node that contains the data locally, Hadoop attempts to find the TaskTracker node closest to the data. In this context the closest node is a node on the same rack where the data is stored. Finally, if a node in the same rack cannot be found, the alternative is to find a node in another rack. Once this process is complete, the task is assigned to the target TaskTracker. It turns out that with this process in place, about 70% of the time, tasks run on nodes where the data is local. Data locality ensures better performance and efficiency.

To recap, task assignment starts with job selection based on a job priority list. A heartbeat protocol is used to announce the ready state of TaskTrackers. Tasks within a job, i.e., Mapper and Reducer tasks, are selected based on the available slots in the TaskTrackers. And lastly, data locality is considered for efficient task assignment. As you can see, the Hadoop Task Assignment process is fairly sophisticated.

Amazon EC2 Experiences Heavy Outage

Today Amazon encountered issues with their EBS system, causing their US East datacenter to go down. The service interruption affected some high-profile websites like Foursquare, Reddit, Quora, Hootsuite, and Heroku.
It's amazing how many sites depend on Amazon for affordable scalability. The pressure must be extremely high for Amazon engineers to maintain a 99.95% system uptime. I'm sure today was a hectic day for them Amazon catz~!!!

Erlang Dictionary Example

Chances are you're looking for examples of how to use a dictionary in Erlang. In this tutorial we will cover the "dict" module, which is one of the more common dictionary data structures found in the Erlang programming language, and illustrate the various ways to manipulate a dictionary through it. Erlang ships with several dictionary-based modules, but in this tutorial we will focus on the "dict" module exclusively.

The dict module is a fairly complete implementation of a key-value dictionary. It provides functions for adding, appending, removing, and updating key-value pairs in a dictionary. Functions for getting the size of the dictionary and checking for the existence of keys are also provided. The dict module also contains functional operations like fold, filter and map. Let's jump right into the dict module and explore the functions available.

First let's explore how to create a dictionary and store key-value pairs in it with the new function and the store function. Here we will create a dictionary and add the key color with the value blue. Note that we don't display the output from functions that create dictionaries, in order to keep the examples noise free. However, we will display any output that illustrates the purpose of a function we are exploring.

1> D = dict:new().
2> D1 = dict:store(color, blue, D).

At this point, we have created a new dictionary D and then stored the key-value pair color:blue in it, producing D1. Notice that most functions in the dict module take a dictionary as an argument and return a new dictionary. If you're coming from an imperative language like Java, you might be wondering why this is the case.
Since Erlang is a functional language, its data is immutable: a dictionary is never changed in place, so each time you add or remove elements, the dict module simply returns a new dictionary. This makes dictionaries safe to share between processes and allows for consistent concurrency.

Now let's enhance our example by adding a key-value pair where the value is a list of items. Once we have a key that maps to a list of elements, we will append new values to the list. We will use the append function for adding further values to a given key.

3> D2 = dict:store(colors, [black], D1).
4> D3 = dict:append(colors, white, D2).

Our example dictionary now contains the following pairs: color:blue, colors:[black,white]. This is a good point to illustrate how to fetch the value associated with a given key. Let's demonstrate how to fetch the values for the key "colors". Be aware that if the key passed to the fetch function is not present in the dictionary, an exception will be raised.

5> io:fwrite("values for key: colors: ~p~n", [dict:fetch(colors, D3)]).
values for key: colors: [black,white]

As you can see, our dictionary contains the expected values, black and white, for the key colors. Let's go on a small tangent and demonstrate how to get the size of the dictionary before we continue to illustrate how to add and remove pairs.

6> io:fwrite("dictionary size: ~p~n", [dict:size(D3)]).
dictionary size: 2
ok

The size function returns the number of elements in the dictionary. Let's go back to adding values to the dictionary. A useful function of the dict module is the append_list function. This function lets you append a list of elements to the value of a specified key. For example, we defined the key colors and added the values black and white; now let's append the values green and orange. Once we append the values, we will follow with a fetch operation on the dictionary for the key colors to verify that our new values were successfully appended.

7> D4 = dict:append_list(colors, [green, orange], D3).
8> io:fwrite("values for key: colors: ~p~n", [dict:fetch(colors, D4)]).
values for key: colors: [black,white,green,orange]
ok

Wow, could that be any easier? Okay, we are halfway through the dict module; stay with me and let's blow this module out of the water. Now let's move on to removing elements from the dictionary. The dict module contains the erase function for deleting all elements from the dictionary that match the specified key. For this example, we will add a new key-value pair to our dictionary, fetch the value associated with the new key, display the size of the dictionary, delete the newly added key-value pair, and lastly display the size of the dictionary again.

9> D5 = dict:store(shape, square, D4).
10> io:fwrite("value for key shape: ~p~n", [dict:fetch(shape, D5)]).
value for key shape: square
ok
11> io:fwrite("dictionary size: ~p~n", [dict:size(D5)]).
dictionary size: 3
ok
D6 = dict:erase(shape, D5).
12> io:fwrite("dictionary size: ~p~n", [dict:size(D6)]).
dictionary size: 2
ok

As expected, we were able to store a new key-value pair, fetch it, and see that our dictionary grew in size by one. After deleting the newly added key-value pair with the erase function, our dictionary decreased in size by one.

Just for completeness, let's cover the find function. The find function is similar to fetch: it gets the value associated with a given key. The return value is the tuple {ok, Value} when the key is present, or the atom error when the given key is not in the dictionary.

13> io:fwrite("value for key color: ~p~n", [dict:find(color, D6)]).
value for key color: {ok,blue}
ok
14> io:fwrite("value for key square: ~p~n", [dict:find(square, D6)]).
value for key square: error
ok

Here we can see how the find function was able to find the value of the specified key in our dictionary. And when we tried to find a key that did not exist in the dictionary, we received error as the return value. Basically, the difference between fetch and find is that find is less intrusive, in the sense that it will not raise an exception when the specified key is not in the dictionary, whereas fetch will.

Next, let's explore how to list all the keys contained in our dictionary. The dict module contains the fetch_keys function, which does exactly that: it returns a list with all the keys in the dictionary.

15> io:fwrite("store keys: ~p~n", [dict:fetch_keys(D6)]).
store keys: [color,colors]
ok

Now we come full circle, back to creating dictionaries. The dict module contains a useful function for creating dictionaries from a given list of tuples. The tuples in the list take the form {Key, Value}. Let's jump right into the from_list function and see it in action.

16> D7 = dict:from_list([{people, abel}, {people, anthony}, {animals, tiger}]).

In this example we construct a new dictionary from the list of tuples. Notice that the list contains two tuples with the same key: people. The newly constructed dictionary contains two keys, people and animals, but be aware that from_list does not collect duplicate keys into a list of values: each tuple is stored in turn, so a later tuple replaces the value of an earlier one, and the key people ends up mapped to anthony alone. To keep several values under one key, add them with append or append_list instead.

And there you have it: we covered most of the functions available in the dict module (new, store, append, append_list, fetch, find, size, from_list, erase, and fetch_keys), with the exception of fold, filter and map. In the second part of this tutorial we will cover these functions.

Raising Erlang Exceptions

Erlang supports several ways to raise exceptions. Each method of raising Erlang exceptions maps to a logical concept related to error handling. It's important to know the different ways to raise exceptions in order to ensure your application behaves correctly and is free from unexpected errors that can cause it to die when it should not, and vice versa. Let's examine the three common ways to raise exceptions in Erlang.

The exit(Why) function can be used to raise exceptions when you need the current process to terminate. If the exception is not caught and the process terminates, an exit signal propagates to all linked processes, which also terminate unless they trap exits. The message that propagates to the linked processes takes the form {'EXIT', Pid, Why}.

The next exception raising function is throw(Why).
The throw function is useful for throwing errors that should not necessarily terminate your process or application. For example, a user-defined function might throw this type of exception to allow the caller to handle it and recover gracefully. In other words, this kind of exception is not as severe as one there is no way to recover from, such as a lost network connection or a detached storage device. Some users might even choose to silently ignore these types of exceptions.

Lastly, the erlang:error(Why) function can be used to raise exceptions that are severe and cannot be recovered from. Simply put, these types of exceptions can be considered internal errors. Now that you know the fundamentals of throwing Erlang exceptions, your applications can be written in a better way.

Example of raising an Erlang exception with throw(Why).

-module(exceptions).
-export([sample_error/0]).

sample_error() ->
    throw("Shit what happened?").

Now let's compile our exceptions module, invoke the sample_error() function, and observe the output of the raised exception.

erlc -o ebin src/exceptions.erl
erl -pa ebin
1> exceptions:sample_error().
** exception throw: "Shit what happened?"
in function exceptions:sample_error/0

As you can see, our example function threw the expected exception, with the argument we passed into the throw(Why) function as the error message.